NOVEL PLANT ACYLTRANSFERASES

INTRODUCTION 

This application claims the benefit of U.S. Provisional Application Serial No. 
60/101,939 filed September 25, 1998. 

Technical Field 

The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 

Background 

Through the development of plant genetic engineering techniques, it is now possible to 
produce transgenic varieties of plant species to provide plants which have novel and desirable 
characteristics. For example, it is now possible to genetically engineer plants for tolerance to 
environmental stresses, such as resistance to pathogens and tolerance to herbicides and to 
improve the quality characteristics of the plant, for example improved fatty acid compositions. 
However, the number of useful nucleotide sequences for the engineering of such 
characteristics is thus far limited and the speed with which new useful nucleotide sequences 
for engineering new characteristics is slow. 

The characterization of various acyltransferase proteins is useful for the further study 
of plant fatty acid synthesis systems and for the development of novel and/or alternative oils 
sources. Studies of plant mechanisms may provide means to further enhance, control, 
modify, or otherwise alter the total fatty acyl composition of triglycerides and oils. 
Furthermore, the elucidation of the factor(s) critical to the natural production of fatty acids in 
plants is desired, including the purification of such factors and the characterization of 
element(s) and/or cofactors which enhance the efficiency of the system. Of particular interest 
are the nucleic acid sequences of genes encoding proteins which may be useful for 
applications in genetic engineering. 
SUMMARY OF THE INVENTION

The present invention provides nucleic acid sequences encoding amino acid sequences for a
class of proteins which are related to acyltransferase proteins. Such proteins are referred
to herein as acyltransferase related or acyltransferase like proteins. By this invention,
nucleic acid sequences encoding these acyltransferase related proteins may now be
characterized with respect to enzyme activity. In particular, identification and isolation
of nucleic acid sequences encoding acyltransferase related proteins from Arabidopsis,
yeast, corn, and soybean are provided.

Thus, this invention encompasses acyltransferase related nucleic acid sequences and 
the corresponding amino acid sequences, and the use of these nucleic acid sequences in the 
preparation of oligonucleotides containing such acyltransferase related encoding sequences 
for analysis and recovery of plant acyltransferase related gene sequences. The acyltransferase
related encoding sequence may encode a complete or partial sequence depending upon the 
intended use. All or a portion of the genomic sequence, or cDNA sequence, is intended. 

Of special interest are recombinant DNA constructs which provide for transcription or
transcription and translation (expression) of the acyltransferase related sequences in host
cells. In particular, constructs which are capable of transcription or transcription and
translation in plant host cells are preferred. For some applications a reduction in sequences
encoding acyltransferase related sequences may be desired. Thus, recombinant constructs
may be designed having the acyltransferase related sequences in a reverse orientation for
expression of an anti-sense sequence, or co-suppression (also known as "transwitch")
constructs may be useful. Such constructs may contain a variety of regulatory regions
including transcriptional initiation regions obtained from genes preferentially expressed in
plant seed tissue. For some uses, it may be desired to use the transcriptional and translational
initiation regions of the acyltransferase related gene either with the acyltransferase related
encoding sequence or to direct the transcription and translation of a heterologous sequence.

Also considered in this invention are the plants and seeds containing the constructs
and polynucleotides of this invention.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the 204 amino acid conserved sequence profile identified from
comparisons of glycerol-3-phosphate acyltransferase and various lysophosphatidic acid
acyltransferases using PSI-BLAST.

Figure 2 provides an amino acid sequence alignment for the acyltransferase 
sequences. The alignment shown is of the regions of the protein extending from about 30 
amino acids prior to the conserved H in the conserved sequence HXXXXD to 100 amino 
acids after, or downstream, of the P in the conserved PEG sequence motif of the 
acyltransferase-like sequences. 

Figure 3 provides schematics showing the relationship of the identified 
acyltransferases. The relationships described are derived from an alignment of the regions of 
the protein extending from about 30 amino acids prior to the conserved H in the conserved 
sequence HXXXXD to 100 amino acids after, or downstream, of the P in the conserved PEG 
sequence motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree
showing the relationship of several acyltransferases. Figure 3B provides a table showing the 
percent similarities and percent divergence of the novel acyltransferases and known 
acyltransferases using the Clustal method with PAM250 residue weight table. 



DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, nucleotide sequences are provided which are 
capable of coding sequences of amino acids, such as, a protein, polypeptide or peptide, which 
are related to nucleic acid sequences encoding acyltransferase proteins, referred to herein as 
acyltransferase-like or acyltransferase related. The novel nucleic acid sequences find use in 
the preparation of constructs to direct their expression in a host cell. Furthermore, the novel 
nucleic acid sequences may find use in the preparation of plant expression constructs to 
modify the fatty acid composition of a plant cell. 

In one embodiment of the present invention, nucleic acid sequences, also referred to 
herein as polynucleotides, are identified from databases which are related to acyltransferases. 
Isolated Proteins, Polypeptides and Polynucleotides

A first aspect of the present invention relates to isolated acyltransferase 
polynucleotides. The polynucleotide sequences of the present invention include isolated 
polynucleotides that encode the polypeptides of the invention having a deduced amino acid 
sequence selected from the group of sequences set forth in the Sequence Listing and to other 
polynucleotide sequences closely related to such sequences and variants thereof. 

The invention provides a polynucleotide sequence identical over its entire length to 
each coding sequence as set forth in the Sequence Listing. The invention also provides the 
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding 
sequence for the mature polypeptide or a fragment thereof in a reading frame with other 
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or 
prepro- protein sequence. The polynucleotide can also include non-coding sequences, 
including for example, but not limited to, non-coding 5' and 3' sequences, such as the 
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that
encodes additional amino acids. For example, a marker sequence can be included to facilitate 
the purification of the fused polypeptide. Polynucleotides of the present invention also 
include polynucleotides comprising a structural gene and the naturally associated sequences 
that control gene expression. 

The invention also includes polynucleotides of the formula: 
X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably SEQ ID NOs: 1, 3, 5,
7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233. In the formula, R2 is oriented so that its 5' end
residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3. Any
stretch of nucleic acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that 
encode for variants of the polypeptides of the invention. Variants that are fragments of the 
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the 
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the
invention are substituted, added or deleted, in any combination. Particularly preferred are
substitutions, additions, and deletions that are silent such that they do not alter the properties 
or activities of the polynucleotide or polypeptide. 

Nucleotide sequences encoding acyltransferases may be obtained from natural sources 
or be partially or wholly artificially synthesized. They may directly correspond to an 
acyltransferase endogenous to a natural source or contain modified amino acid sequences, 
such as sequences which have been mutated, truncated, increased or the like. Acyltransferases 
may be obtained by a variety of methods, including but not limited to, partial or homogeneous
purification of protein extracts, protein modeling, nucleic acid probes, antibody preparations
and sequence comparisons. Typically an acyltransferase will be derived in whole or in part
from a natural source. A natural source includes, but is not limited to, prokaryotic and
eukaryotic sources, including bacteria, yeasts, plants, including algae, and the like.

Of special interest are acyltransferases which are obtainable from eukaryotic sources,
including those which are obtained from plants, or from acyltransferases which are
obtainable through the use of these sequences. "Obtainable" refers to those acyltransferases 
which have sufficiently similar sequences to that of the sequences provided herein to provide 
a biologically active protein of the present invention. 

Further preferred embodiments of the invention are polynucleotides that are at least 50%,
60%, or 70% identical over their entire length to a polynucleotide encoding a polypeptide of
the invention, and polynucleotides that are complementary to such polynucleotides. More
preferable are polynucleotides that comprise a region that is at least 80% identical over its
entire length to a polynucleotide encoding a polypeptide of the invention and polynucleotides
that are complementary thereto. In this regard, polynucleotides at least 90% identical over
their entire length are particularly preferred, and those at least 95% identical are especially
preferred. Further, those with at least 97% identity are highly preferred and those with at
least 98% and 99% identity are particularly highly preferred, with those at least 99% being
the most highly preferred.

Preferred embodiments are polynucleotides that encode polypeptides that retain 
substantially the same biological function or activity as the mature polypeptides encoded by 
the polynucleotides set forth in the Sequence Listing. 
The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under
stringent conditions to the above-described polynucleotides. As used herein, the terms
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will
generally occur if there is at least 95% and preferably at least 97% identity between the
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (1x SSC being 150 mM NaCl, 15 mM trisodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20
micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.
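
As a rough illustration of the salt concentrations implied by the hybridization and wash
buffers described above, the following sketch scales the standard 1x SSC composition
(150 mM NaCl, 15 mM trisodium citrate). The function and constant names are illustrative
assumptions and are not part of the disclosed method.

```python
# Composition of 1x SSC in mM. Names below are hypothetical, chosen for illustration.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(strength):
    """Return component concentrations (mM) for SSC at a given strength, e.g. 5 for 5x."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    return {name: conc * strength for name, conc in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5)   # 5x SSC used during overnight incubation
wash = ssc_concentrations(0.1)          # 0.1x SSC used for the stringent wash
```

The fifty-fold drop in salt between the hybridization buffer and the wash, together with the
higher wash temperature, is what makes the wash step the more stringent of the two.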

The invention also provides a polynucleotide consisting essentially of a 
polynucleotide sequence obtainable by screening an appropriate library containing the 
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or 
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for 
obtaining such a polynucleotide include, for example, probes and primers as described herein. 

As discussed herein regarding polynucleotide assays of the invention, for example, 
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or 
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to 
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a 
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases. 
Particularly preferred probes will have between 30 bases and 50 bases, inclusive. 
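
The probe length preferences stated above can be summarized in a small helper. This is a
minimal sketch; the function name and the category labels are assumptions introduced only
for illustration.

```python
def classify_probe_length(n_bases):
    """Map a probe length in bases onto the preference tiers stated in the text."""
    if n_bases < 15:
        return "below minimum"            # probes generally comprise at least 15 bases
    if 30 <= n_bases <= 50:
        return "particularly preferred"   # between 30 and 50 bases, inclusive
    if n_bases > 50:
        return "preferred"                # at least 30 bases, and can exceed 50
    return "acceptable"                   # 15 to 29 bases
```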

The coding region of each gene that comprises or is comprised by a polynucleotide 
sequence set forth in the Sequence Listing may be isolated by screening using a DNA 
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled 
oligonucleotide having a sequence complementary to that of a gene of the invention is then 
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which 
correspond to the N-terminal sequence of the polypeptide. The partial sequences so prepared 
can then be used as probes to obtain acyltransferase clones from a gene library prepared from 
a cell source of interest. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular peptides, such probes may be used directly to screen gene libraries
for gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.

Typically, a sequence obtainable from the use of nucleic acid probes will show 60-70%
sequence identity between the target acyltransferase sequence and the encoding sequence
used as a probe. However, lengthy sequences with as little as 50-60% sequence identity may 
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic acid 
sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic acid 
fragments are employed as probes (greater than about 100 bp), one may screen at lower 
stringencies in order to obtain sequences from the target sample which have 20-50% 
deviation (i.e., 50-80% sequence homology) from the sequences used as probe. 
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence 
encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about
15, and more preferably at least about 20 nucleotides. A higher degree of sequence identity is 
desired when shorter regions are used as opposed to longer regions. It may thus be desirable 
to identify regions of highly conserved amino acid sequence to design oligonucleotide probes 
for detecting and recovering other related genes. Shorter probes are often particularly useful 
for polymerase chain reactions (PCR), especially when highly conserved sequences can be
identified. (See, Gould, et al., PNAS USA (1989) 86:1934-1938).

The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence 
will be incomplete, in that the region coding for the polypeptide is truncated with respect to 
the 5' terminus of the cDNA. This is a consequence of the reverse transcriptase, an enzyme
with low 'processivity' (a measure of the ability of the enzyme to remain attached to the 
template during the polymerization reaction) employed during the first strand cDNA 
synthesis. 

Several methods are available, and are well known to the skilled artisan, to obtain
full-length cDNAs or extend short cDNAs, for example those based on the method of Rapid
Amplification of cDNA Ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002). Recent modifications of the technique, exemplified by the
Marathon™ technology (Clontech Laboratories, Inc.) for example, have significantly
simplified obtaining full-length cDNA sequences. 
Another aspect of the present invention relates to isolated acyltransferase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence
Listing, as well as polypeptides and fragments thereof, particularly those polypeptides which
exhibit acyltransferase activity and also those polypeptides which have at least 50%, 60% or
70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing
the sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited
to, those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); a suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm can also
be used to determine identity.
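
To make the notion of "identity" concrete, the following minimal sketch computes percent
identity from two sequences that have already been aligned. It assumes gaps are written as
"-" and that identity is reported over all alignment columns; the programs cited above may
use other denominators, so this is an illustration and not a reproduction of any of them.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences hold a gap; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

For example, two aligned four-residue sequences that differ at one position would be
reported as 75% identical under this definition.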
Parameters for polypeptide sequence comparison typically include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters along
with no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
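
The polynucleotide parameters listed above can be exercised with a short global alignment
sketch in the style of Needleman and Wunsch, extended with separate gap and gap length
penalties. This is not the "gap" program itself. It assumes that a gap of length k is
charged the gap penalty once plus the gap length penalty k times, and that end gaps are
penalized like internal gaps; the function name is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Best global alignment score for sequences a and b under affine gap penalties."""
    n, m = len(a), len(b)
    # M: last column aligns a[i-1] with b[j-1]; X: a[i-1] against a gap;
    # Y: b[j-1] against a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays gap_open; every gap position pays gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          Y[i - 1][j] - gap_open,
                          X[i - 1][j]) - gap_extend
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          X[i][j - 1] - gap_open,
                          Y[i][j - 1]) - gap_extend
    return max(M[n][m], X[n][m], Y[n][m])
```

With these values a single one-base gap costs 53, more than five matches are worth, so
short alignments strongly favor mismatches over gaps.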

The invention also includes polypeptides of the formula: 

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15,
17, 19, 21, 23, and 218-225. In the formula, R2 is oriented so that its amino terminal residue
is at the left, bound to R1, and its carboxy terminal residue is at the right, bound to R3. Any
stretch of amino acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a 
polynucleotide comprising a sequence selected from the group of a sequence contained in 
SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233.

The polypeptides of the present invention can be mature protein or can be part of a
fusion protein.
Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is 
entirely the same as part but not all of the amino acid sequence of the previously described 
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide 
of which the fragment forms a part or a region, most preferably as a single continuous region. 
Preferred fragments are biologically active fragments which are those fragments that mediate 
activities of the polypeptides of the invention, including those with similar activity or 
improved activity or with a decreased activity. Also included are those fragments that are
antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution
of a residue by another with like characteristics. In general, such substitutions are among Ala,
Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between
Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1
to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any combination.
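
The conservative substitution groups named above lend themselves to a simple lookup. The
sketch below uses standard three-letter residue codes; the names `CONSERVATIVE_GROUPS` and
`is_conservative` are illustrative assumptions.

```python
# Residue groups taken from the text, written with standard three-letter codes.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},
    {"Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original, replacement):
    """True if two distinct residues fall within one of the listed groups."""
    if original == replacement:
        return False  # identical residues: no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

For instance, Leu to Ile would count as conservative under these groups, whereas Asp to Lys
would not.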

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these 
variants can be used as intermediates for producing the full-length polypeptides of the 
invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in 
the transformation of various host cells, as further discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a mature 
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the
mature polypeptide (for example, when the mature form of the protein has more than one
polypeptide chain). Such sequences can, for example, play a role in the processing of a
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated
that cellular enzymes can be used to remove any additional amino acids from the mature
protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally
are activated when the prosequences are removed. Some or all of the prosequences may be
removed prior to activation. Such precursor proteins are generally called proproteins.
The polynucleotide and polypeptide sequences can also be used to identify additional
sequences which are homologous to the sequences of the present invention. The most
preferable and convenient method is to store the sequence in a computer readable medium,
for example, floppy disk, CD ROM, hard disk drives, external disk drives and DVD, and then
to use the stored sequence to search a sequence database with well known searching tools.
Examples of public databases include the DNA Database of Japan
(DDBJ) (http://www.ddbj.nig.ac.jp/); GenBank; and the European Molecular Biology
Laboratory Nucleic Acid Sequence Database (EMBL). A number of different search algorithms
are available to the skilled artisan, one example of which is the suite of programs referred to
as BLAST programs. There are five implementations of BLAST, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein
sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80
(1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are
available in the art for the analysis of identified sequences, such as sequence alignment 
programs, programs for the identification of more distantly related sequences, and the like, 
and are well known to the skilled artisan. 
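
The division of the five BLAST programs by query type can be captured in a small table.
The query types follow the text; the database types listed alongside them reflect the
standard behavior of each program and are supplied here as background, and the helper name
is hypothetical.

```python
# program: (query type, database type searched); "translated" means six-frame translation.
BLAST_PROGRAMS = {
    "BLASTN": ("nucleotide", "nucleotide"),
    "BLASTX": ("nucleotide", "protein"),                 # query is translated
    "TBLASTX": ("nucleotide", "translated nucleotide"),  # query and database translated
    "BLASTP": ("protein", "protein"),
    "TBLASTN": ("protein", "translated nucleotide"),     # database is translated
}


def programs_for_query(query_type):
    """Return the BLAST programs that accept the given query type, sorted by name."""
    return sorted(name for name, (query, _db) in BLAST_PROGRAMS.items()
                  if query == query_type)
```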

Plant Constructs and Methods of Use 






Of interest in the present invention is the use of the nucleotide sequences, or
polynucleotides, in recombinant DNA constructs to direct the transcription or transcription
and translation (expression) of the acyltransferase sequences of the present invention in a host 
cell. 

Of particular interest is the use of the nucleotide sequences, or polynucleotides, in
recombinant DNA constructs to direct the transcription or transcription and translation 
(expression) of the acyltransferase sequences of the present invention in a host cell. The 
expression constructs generally comprise a promoter functional in a host cell operably linked 
to a nucleic acid sequence encoding an acyltransferase of the present invention and a 
transcriptional termination region functional in a host cell. 

By "host cell" is meant a cell which contains a vector and supports the replication,
and/or transcription or transcription and translation (expression) of the expression construct. 
Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or
eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, host
cells are monocotyledonous or dicotyledonous plant cells.

Of particular interest in the present invention is the use of the polynucleotides of the 
present invention for the preparation of constructs to direct the transcription or transcription 
and translation of the nucleotide sequences encoding an acyltransferase in a host plant cell. 
Plant expression constructs generally comprise a promoter functional in a plant host cell 
operably linked to a nucleic acid sequence of the present invention and a transcriptional termination
region functional in a host plant cell. 

Those skilled in the art will recognize that there are a number of promoters which are 
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of promoters are constitutive promoters such as the CaMV35S or FMV35S 
promoters that yield high levels of expression in most plant organs. Enhanced or duplicated 
versions of the CaMV35S and FMV35S promoters are useful in the practice of this invention 
(Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the protein of interest in
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter
chosen should have the desired tissue and developmental specificity.

Of particular interest is the expression of the nucleic acid sequences of the present 
invention from transcription initiation regions which are preferentially expressed in a plant 
seed tissue. Examples of such seed preferential transcription initiation sequences include 
those sequences derived from sequences encoding plant storage protein genes or from genes 
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5'
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin. 

It may be advantageous to direct the localization of proteins conferring acyltransferase 
to a particular subcellular compartment, for example, to the mitochondrion, endoplasmic 
reticulum, vacuoles, chloroplast or other plastidic compartment. For example, where the 
genes of interest of the present invention will be targeted to plastids, such as chloroplasts, for 
expression, the constructs will also employ the use of sequences to direct the gene to the
plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid 
transit peptides (PTP). In this manner, where the gene of interest is not directly inserted into 
the plastid, the expression construct will additionally contain a gene encoding a transit 
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be 
derived from the gene of interest, or may be derived from a heterologous sequence having a 
CTP. Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991)
Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550;
della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys.
Res. Commun. 196:1414-1421; and Shah et al. (1986) Science 233:478-481. Additional transit
peptides for the translocation of the protein to the endoplasmic reticulum (ER) or vacuole
may also find use in the constructs of the present invention. 

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire acyltransferase protein, or a portion thereof. For example, 
where antisense inhibition of a given acyltransferase protein is desired, the entire sequence is 
not required. Furthermore, where acyltransferase sequences used in constructs are intended 
for use as probes, it may be advantageous to prepare constructs containing only a particular 
portion of an acyltransferase encoding sequence, for example a sequence which is discovered
to encode a highly conserved acyltransferase region. 

The skilled artisan will recognize that there are various methods for the inhibition of 
expression of endogenous sequences in a host cell. Such methods include, but are not limited
to, antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli,
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and
combinations of sense and antisense, such as those described by Waterhouse, et al. (1998)
Proc. Natl. Acad. Sci. USA 95:13959-13964. Methods for the suppression of endogenous
sequences in a host cell typically employ the transcription or transcription and translation of 
at least a portion of the sequence to be suppressed. Such sequences may be homologous to 
coding as well as non-coding regions of the endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression 
constructs of this invention as well. Transcript termination regions may be provided by the 
DNA sequence encoding the acyltransferase or a convenient transcription termination region 
derived from a different gene source, for example, the transcript termination region which is 
naturally associated with the transcript initiation region. The skilled artisan will recognize
that any convenient transcript termination region which is capable of terminating transcription
in a plant cell may be employed in the constructs of the present invention. 

Alternatively, constructs may be prepared to direct the expression of the 
acyltransferase sequences directly from the host plant cell plastid. Such constructs and 
methods are known in the art and are generally described, for example, in Svab, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci.
USA 90:913-917 and in U.S. Patent Number 5,693,507. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, 
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of
the cell or plant and progeny produced from a breeding program employing such a transgenic
plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of
an introduced acyltransferase nucleic acid sequence. 

The term "introduced", in the context of inserting a nucleic acid sequence into a cell,
means "transfection" or "transformation" or "transduction" and includes reference to the
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the 
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

Plant expression or transcription constructs having an acyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a 
wide variety of plant life, particularly, plant life involved in the production of vegetable oils 
for edible and industrial uses. Plants of interest in the present invention include 
monocotyledonous and dicotyledonous plants. Most especially preferred are temperate
oilseed crops. Plants of interest include, but are not limited to, rapeseed (Canola and High
Erucic Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms,
and corn. Depending on the method for introducing the recombinant constructs into the host
cell, other DNA sequences may be required. Importantly, this invention is applicable to
dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or
improved transformation and regulation techniques. 

As used herein, the term "plant" includes reference to whole plants, plant organs (for 
example, leaves, stems, roots, etc.), seeds, and plant cells and progeny of same. Plant cell, as 
used herein, includes, without limitation, seeds, suspension cultures, embryos, meristematic
regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and
microspores. The class of plants which can be used in the methods of the present invention is
generally as broad as the class of higher plants amenable to transformation techniques,
including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties),
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Most
especially preferred plants include Brassica, soybean, and corn. 

As used herein, "transgenic plant" includes reference to a plant which comprises 
within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide 
is stably integrated within the genome such that the polynucleotide is passed on to successive 
generations. The heterologous polynucleotide may be integrated into the genome alone or as 
part of a recombinant expression cassette. "Transgenic" is used herein to include any cell,
cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the
presence of heterologous nucleic acid including those transgenics initially so altered as well 
as those created by sexual crosses or asexual propagation from the initial transgenic. 

Thus a plant having within its cells a heterologous polynucleotide is referred to herein 
as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the 
genome, or can be extra-chromosomal. Preferably, the polynucleotide of the present 
invention is stably integrated into the genome such that the polynucleotide is passed on to 
successive generations. The polynucleotide is integrated into the genome alone or as part of a 
recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line, 
callus, tissue, plant part or plant, the genotype of which has been altered by the presence of 
heterologous nucleic acids including those transgenics initially so altered as well as those 
created by sexual crosses or asexual reproduction of the initial transgenics. 

As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that 
originates from a foreign species, or, if from the same species, is substantially modified from 
its native form in composition and/or genomic locus by deliberate human intervention. For 
example, a promoter operably linked to a heterologous structural gene is from a species 
different from that from which the structural gene was derived, or, if from the same species, 
one or both are substantially modified from their original form. A heterologous protein may 
originate from a foreign species, or, if from the same species, is substantially modified from 
its original form by deliberate human intervention.

As used herein, a "recombinant expression cassette" is a nucleic acid construct,
generated recombinantly or synthetically, with a series of specified nucleic acid elements
which permit transcription of a particular nucleic acid in a target cell. The recombinant
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA,
plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette
portion of an expression vector includes, among other sequences, a nucleic acid sequence to
be transcribed and a promoter.

It is contemplated that the gene sequences may be synthesized, either completely or in
part, especially where it is desirable to provide plant-preferred sequences. Thus, all or a
portion of the desired structural gene (that portion of the gene which encodes the
acyltransferase protein) may be synthesized using codons preferred by a selected host.
Host-preferred codons may be determined, for example, from the codons used most frequently in
the proteins expressed in a desired host species.
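
By way of illustration only, the determination of host-preferred codons described above might be sketched as follows. The sketch assumes the standard genetic code and a small set of in-frame coding sequences from the host; the function names preferred_codons and back_translate are hypothetical and are not part of the disclosed methods.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return {amino acid: most frequently used codon} for in-frame coding sequences."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        # Walk the sequence codon by codon, ignoring any trailing partial codon.
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            amino_acid = CODON_TABLE.get(codon)
            # Skip stop codons and codons containing ambiguous bases.
            if amino_acid is not None and amino_acid != "*":
                counts[amino_acid][codon] += 1
    return {aa: codon_counts.most_common(1)[0][0] for aa, codon_counts in counts.items()}


def back_translate(protein, preferred):
    """Write a protein sequence using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein.upper())
```

In practice the codon counts would be drawn from a large collection of genes expressed in the selected host rather than from a handful of sequences.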

One skilled in the art will readily recognize that antibody preparations, nucleic acid 
probes (DNA and RNA) and the like may be prepared and used to screen and recover
"homologous" or "related" acyltransferase from a variety of plant sources. Homologous
sequences are found when there is an identity of sequence, which may be determined upon
comparison of sequence information, nucleic acid or amino acid, or through hybridization
reactions between a known acyltransferase and a candidate source. Conservative changes,
such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in
determining sequence homology. Amino acid sequences are considered homologous by as
little as 25% sequence identity between the two complete mature proteins. (See generally,
Doolittle, R.F., Of URFs and ORFs (University Science Books, CA, 1986).)
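
As a minimal sketch of the comparison just described, percent identity and a similarity figure that also counts the listed conservative changes could be computed from two pre-aligned amino acid sequences as shown below. The function name, the gap character, and the choice to count only ungapped alignment columns are assumptions made for illustration.

```python
# Conservative pairs named in the text: Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys, Gln/Asn.
CONSERVATIVE_PAIRS = {frozenset(pair) for pair in ("ED", "VI", "ST", "RK", "QN")}


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = conservative = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # columns containing a gap are not scored in this sketch
        compared += 1
        if x == y:
            identical += 1
        elif frozenset((x, y)) in CONSERVATIVE_PAIRS:
            conservative += 1
    if compared == 0:
        return 0.0, 0.0
    identity = 100.0 * identical / compared
    similarity = 100.0 * (identical + conservative) / compared
    return identity, similarity
```

Under the 25% figure given above, a pair of complete mature proteins would be considered homologous when the first returned value is at least 25.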

Thus, other acyltransferase sequences can be obtained from the specific exemplified 
sequences provided herein. Furthermore, it will be apparent that one can obtain natural and
synthetic sequences, including modified amino acid sequences and starting materials for
synthetic-protein modeling from the exemplified sequences and from acyltransferases which
are obtained through the use of such exemplified sequences. Modified amino acid sequences
include sequences which have been mutated, truncated, increased and the like, whether such
sequences were partially or wholly synthesized. Sequences which are actually purified from
plant preparations or are identical or encode identical proteins thereto, regardless of the
method used to obtain the protein or sequence, are equally considered naturally derived.



WO 00/18889 PCT/US99/22231

For immunological screening, antibodies to the acyltransferase protein can be
prepared by injecting rabbits or mice with the purified protein or portion thereof, such
methods of preparing antibodies being well known to those in the art. Either monoclonal or
polyclonal antibodies can be produced, although typically polyclonal antibodies are more
useful for gene isolation. Western analysis may be conducted to determine that a related
protein is present in a crude extract of the desired plant species, as determined by
cross-reaction with the antibodies to the acyltransferase protein. When cross-reactivity is observed,
genes encoding the related proteins are isolated by screening expression libraries representing
the desired plant species. Expression libraries can be constructed in a variety of commercially
available vectors, including lambda gt11, as described in Sambrook, et al. (Molecular
Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York).

The nucleic acid sequences associated with acyltransferase proteins will find many 
uses. For example, recombinant constructs can be prepared which can be used as probes, or
which will provide for expression of the acyltransferase protein in host cells to produce a
ready source of the enzyme and/or to modify the composition of triglycerides found therein.
Other useful applications may be found when the host cell is a plant host cell, either in vitro
or in vivo.

The modification of fatty acid compositions may also affect the fluidity of plant 
membranes. Different lipid concentrations have been observed in cold-hardened plants, for
example. By this invention, one may be capable of introducing traits which will lend to chill 
tolerance. Constitutive or temperature inducible transcription initiation regulatory control 
regions may have special applications for such uses. 

As discussed above, nucleic acid sequence encoding an acyltransferase of this 
invention may include genomic, cDNA or mRNA sequence. By "encoding" is meant that the
sequence corresponds to a particular amino acid sequence either in a sense or anti-sense
orientation. By "extrachromosomal" is meant that the sequence is outside of the plant
genome with which it is naturally associated. By "recombinant" is meant that the sequence
contains a genetically engineered modification through manipulation via mutagenesis,
restriction enzymes, and the like.

Once the desired acyltransferase nucleic acid sequence is obtained, it may be 
manipulated in a variety of ways. Where the sequence involves non-coding flanking regions, 
the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions,
transversions, deletions, and insertions may be performed on the naturally occurring
sequence. In addition, all or part of the sequence may be synthesized. In the structural gene,
one or more codons may be modified to provide for a modified amino acid sequence, or one
or more codon mutations may be introduced to provide for a convenient restriction site or
other purpose involved with construction or expression. The structural gene may be further
modified by employing synthetic adapters, linkers to introduce one or more convenient
restriction sites, or the like.

The nucleic acid or amino acid sequences encoding an acyltransferase of this
invention may be combined with other non-native, or "heterologous", sequences in a variety
of ways. By "heterologous" sequences is meant any sequence which is not naturally found
joined to the acyltransferase, including, for example, combinations of nucleic acid sequences
from the same plant which are not naturally found joined together.

The DNA sequence encoding an acyltransferase of this invention may be employed in 
conjunction with all or part of the gene sequences normally associated with the 
acyltransferase. In its component parts, a DNA sequence encoding acyltransferase is
combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription
initiation control region capable of promoting transcription and translation in a host cell, the 
DNA sequence encoding plant acyltransferase and a transcription and translation termination 
region. 

Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells
such as yeast, insect, amphibian, or mammalian cells. A host cell may be unicellular or found
in a multicellular differentiated or undifferentiated organism depending upon the intended
use. Preferably, host cells of the present invention include plant cells, both
monocotyledonous and dicotyledonous. Cells of this invention may be distinguished by
having a sequence foreign to the wild-type cell present therein, for example, by having a
recombinant nucleic acid construct encoding an acyltransferase therein.

The methods used for the transformation of the host plant cell are not critical to the 
present invention. The transformation of the plant is preferably permanent, i.e., by integration
of the introduced expression constructs into the host plant genome, so that the introduced
constructs are passed onto successive plant generations. The skilled artisan will recognize
that a wide variety of transformation techniques exist in the art, and new techniques are
continually becoming available. Any technique that is suitable for the target host plant can be
employed within the scope of the present invention. For example, the constructs can be
introduced in a variety of forms including, but not limited to, as a strand of DNA, in a
plasmid, or in an artificial chromosome. The introduction of the constructs into the target
plant cells can be accomplished by a variety of techniques, including, but not limited to,
calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium
infection, liposomes or microprojectile transformation. The skilled artisan can refer to the
literature for details and select suitable techniques for use in the methods of the present
invention.

Normally, included with the DNA construct will be a structural gene having the
necessary regulatory regions for expression in a host and providing for selection of
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic,
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral
immunity or the like. Depending upon the number of different host species into which the
expression construct or components thereof are introduced, one or more markers may be employed,
where different conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used
which may be introduced into the Agrobacterium host for homologous recombination with
T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid
containing the T-DNA for recombination may be armed (capable of causing gall formation)
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a
mixture of normal plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming host
plant cells, the expression or transcription construct bordered by the T-DNA border region(s)
will be inserted into a broad host range vector capable of replication in E. coli and
Agrobacterium, there being broad host range vectors described in the literature. Commonly
used is pRK2 or derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci.,
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference.
Alternatively, one may insert the sequences to be expressed in plant cells into a vector
containing separate replication sequences, one of which stabilizes the vector in E. coli, and
the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol.
(1990) 14:269-276), wherein the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374)
origin of replication is utilized and provides for added stability of the plant expression
vectors in host Agrobacterium cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The
particular marker employed is not essential to this invention, one or another marker being
preferred depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and 
incubated with the transformed Agrobacterium for sufficient time for transformation, the 
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus
forms, shoot formation can be encouraged by employing the appropriate plant hormones in
accordance with known methods and the shoots transferred to rooting medium for 
regeneration of plants. The plants may then be grown to seed and the seed used to establish 
repetitive generations and for isolation of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention which
contain multiple expression constructs. Any means for producing a plant comprising a
construct having a nucleic acid sequence of the present invention, and at least one other
construct having another DNA sequence encoding an enzyme are encompassed by the present
invention. For example, the expression construct can be used to transform a plant at the same
time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the 
first expression construct, or alternatively, transformed plants, one having the first construct 
and one having the second construct, can be crossed to bring the constructs together in the 
same plant. 

In general, acyltransferase proteins are active in the transfer of acyl groups from a
donor to a variety of different substrates. For example, diacylglycerol acyltransferases add
acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol
acyltransferase uses an acyl-CoA as a donor to transfer an acyl group to a sterol to form a
sterol ester. Typically, the substrates include, but are not limited to, glycerides, including
mono and diglycerides, sterols, stanols, phosphatides, and the like. Donors include, but are
not limited to, acyl-CoA and acyl-ACP molecules.

The invention now being generally described, it will be more readily understood by
reference to the following examples which are included for purposes of illustration only and
are not intended to limit the present invention.

EXAMPLES 

Example 1: RNA Isolations 

Total RNA from the inflorescence and developing seeds of Arabidopsis thaliana is
isolated for use in construction of complementary DNA (cDNA) libraries. The procedure is an
adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp,
(1990) Plant Molec. Reporter, 8, 180-185). The following description assumes the use of 1 g
fresh weight of tissue. Frozen seed tissue is powdered by grinding under liquid nitrogen. The
powder is added to 10 ml REC buffer (50 mM Tris-HCl, pH 9, 0.8 M NaCl, 10 mM EDTA,
0.5% w/v CTAB (cetyltrimethyl-ammonium bromide)) along with 0.2 g insoluble
polyvinylpolypyrrolidone, and ground at room temperature. The homogenate is centrifuged
for 5 minutes at 12,000 x g to pellet insoluble material. The resulting supernatant fraction is
extracted with chloroform, and the top phase is recovered.

The RNA is then precipitated by addition of 1 volume RecP (50 mM Tris-HCl, pH 9,
10 mM EDTA and 0.5% (w/v) CTAB) and collected by brief centrifugation as before. The
RNA pellet is redissolved in 0.4 ml of 1 M NaCl. The RNA pellet is redissolved in water and
extracted with phenol/chloroform. Sufficient 3 M potassium acetate (pH 5) is added to make
the mixture 0.3 M in acetate, followed by addition of two volumes of ethanol to precipitate the
RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored
frozen.
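
For illustration, the volume of 3 M potassium acetate needed to bring a sample to 0.3 M acetate, and the corresponding two volumes of ethanol, may be worked out as sketched below. The sample volume of 0.9 ml is a hypothetical value chosen for the arithmetic and is not taken from the protocol; the function names are likewise illustrative, and the calculation assumes that volumes are additive.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so that the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v,
    assuming the sample contains no acetate and volumes are additive.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc * sample_volume / (stock_conc - final_conc)


def ethanol_volume(mixture_volume, volumes=2.0):
    """Volume of ethanol corresponding to the stated number of volumes."""
    return volumes * mixture_volume


# Hypothetical example: 0.9 ml of aqueous RNA after phenol/chloroform extraction.
sample_ml = 0.9
acetate_ml = stock_volume_to_add(sample_ml, stock_conc=3.0, final_conc=0.3)
ethanol_ml = ethanol_volume(sample_ml + acetate_ml)
print(f"add {acetate_ml:.2f} ml of 3 M potassium acetate, then {ethanol_ml:.2f} ml ethanol")
```

For a 3 M stock and a 0.3 M target the relation reduces to adding one ninth of the sample volume.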

Alternatively, total RNA may be obtained using TRIzol reagent (BRL-Life Technologies,
Gaithersburg, MD) following the manufacturer's protocol. The RNA
precipitate is dissolved in water and stored frozen.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software
and hardware enable the use of the Smith-Waterman algorithm in searching DNA and
protein databases using profiles as queries. The program used to query protein databases is
profilesearch. This is a search where the query is not a single sequence but a profile based on
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a
sequence data set, i.e., a sequence database. The profile contains all the pertinent information
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the
standard query searches. The program used to query nucleotide databases with a protein
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid
profile query. As the search is running, sequences in the database are translated to amino acid
sequences in six reading frames. The output file for tprofilesearch is identical to the output
file for profilesearch except for an additional column that indicates the frame in which the
best alignment occurred.
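
The translation of a database nucleic acid sequence in six reading frames, as mentioned above, can be illustrated with the following sketch. It is not the tprofilesearch implementation, which is not reproduced in this description; it assumes the standard genetic code, and the frame labels (+1 to +3 and -1 to -3) and function names are illustrative conventions.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become X, stops become *."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Return {frame label: protein} for the three forward and three reverse frames."""
    dna = dna.upper().replace("U", "T")
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames
```

Each of the six translated sequences would then be scored against the amino acid profile, and the frame giving the best alignment reported.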

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to
search for similarities between one sequence from the query and a group of sequences
contained in the database. E score values as well as other sequence information, such as
conserved peptide sequences of HXXXXD and PEG, are used to identify related sequences.
By using the conserved peptide sequence information, E score values of greater than E-12 and
E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had
an E score of 0.0094, while the EST sequence originally used to identify ATLPAAT1 had an
E score of 0.0868.
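
One way the conserved peptide information might be applied to candidate sequences is sketched below: a translated sequence is scanned for the HXXXXD and PEG motifs, where X is taken to mean any residue. The function names and the rule that both motifs must be present are assumptions made for illustration; the description above does not state how the motif information and the E score values were combined.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
HXXXXD = re.compile(r"(?=H.{4}D)")
PEG = re.compile(r"(?=PEG)")


def find_motifs(protein):
    """Return 1-based start positions of each conserved motif in a protein sequence."""
    protein = protein.upper()
    return {
        "HXXXXD": [m.start() + 1 for m in HXXXXD.finditer(protein)],
        "PEG": [m.start() + 1 for m in PEG.finditer(protein)],
    }


def has_both_motifs(protein):
    """True when at least one HXXXXD and at least one PEG motif are present."""
    hits = find_motifs(protein)
    return bool(hits["HXXXXD"]) and bool(hits["PEG"])
```

A weak database hit carrying both motifs could then be retained for further analysis even though its E score alone would not be considered significant.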

A protein sequence of glycerol-3-phosphate acyltransferase (G3PAAT) from E. coli (Swiss Prot
Accession P00482) is used to search the NCBI non-redundant protein database using BLAST. In the
first round of searches, other membrane forms of G3PAAT are identified. In subsequent
PSI-BLAST searches (Altschul, et al. (1997) Nucleic Acids Res 25:3389-3402), LPAATs and
other acyltransferases are identified. Using sequence alignment software programs, G3PAAT
and different LPAAT amino acid sequences are aligned, and a profile is generated using a
homologous sequence region, between amino acids 256 and 459 of the E. coli sequence.

The identified 204 amino acid region is used to query the protein database using PSI-BLAST.
After 5 iterations of PSI-BLAST, the profile generated from this new query (Figure 1)
identified soluble forms of G3PAAT. Prior to this identification, no sequence homology had
been identified between the membrane and soluble forms of G3PAAT.

Example 3: Excision of PSI-BLAST Profile

The profile generated from the queries using PSI-BLAST is excised from the hypertext
markup language (html) file. The worldwide web (www)/html interface to psiblast at
ncbi stores the current generated profile matrix in a hidden field in the html file that is
returned after each iteration of psiblast. However, this matrix has been encoded into string62
(s62) format for ease of transport through html. String62 format is a simple conversion of the
values of the matrix into html legal ascii characters.
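
A hedged sketch of how a hidden form field might be excised from a returned html file is given below, using the html.parser module of the Python standard library. The name of the hidden field that carries the matrix is not given in this description, so the sketch collects every hidden input field and leaves the selection to the user; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect the name and value of every hidden <input> field in an html page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def hidden_fields(html_text):
    """Return {field name: value} for the hidden input fields in html_text."""
    collector = HiddenFieldCollector()
    collector.feed(html_text)
    collector.close()
    return collector.fields
```

The value of the field holding the profile would then be passed on for conversion from string62 format.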

The encoded matrix width (x axis) is 26 characters, and comprises the consensus
character, the probabilities of each amino acid in the order A,B,C,D,E,F,G,H,I,K,L,M,N,
P,Q,R,S,T,V,W,X,Y,Z (where B represents D and N, and Z represents Q and E, and X
represents any amino acid), gap creation value, and gap extension value.

The length (y axis) of the matrix corresponds to the length of the sequences identified 
by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid 
sequence of the sequences identified using PSI-BLAST, with the N-terminal end at the top of 
the matrix. The probabilities of other amino acids at that position are represented for each
amino acid along the x axis, below the respective single letter amino acid abbreviation.

Thus, each row of the profile consists of the highest scoring (consensus) amino acid,
followed by the scores for each possible amino acid at that position in the sequence matrix, the
score for opening a gap at that position, and the score for continuing a gap at that position.
The string62 file is converted back into a profile for use in subsequent searches. The
gap open field is set to 11 and the gap extension field is set to 1 along the x axis. The gap
creation and gap extension values are known, based on the settings given to the PSI-BLAST
algorithm. The matrix is exported to the standard GCG profile form. This format can be read
by GenWeb.

The algorithm used to convert the string62 formatted file to the matrix is outlined in
Table 1.

Table 1

1. if the encoded character is z then the value is blast score min
2. if the encoded character is Z then the value is blast score max
3. else if the encoded character is uppercase then its value is (64 - (ascii # of char))
4. else if the encoded character is a digit then the value is ((ascii # of char) - 48)
5. else if the encoded character is not uppercase then the value is ((ascii # of char) - 87)
6. ALL B positions are set to min of D and N amino acids at that row in sequence matrix
7. ALL Z positions are set to min of Q and E amino acids at that row in sequence matrix
8. ALL X positions are set to min of all amino acids at that row in sequence matrix
9. kBLAST_SCORE_MAX=999;
10. kBLAST_SCORE_MIN=-999;
11. all gap opens are set to 11
12. all gap lens are set to 1
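
The rules of Table 1 may be sketched in code as follows. This is an illustrative reading of the table and not the program actually used: it assumes that each encoded row is 26 characters long (one consensus character, 23 encoded scores in the column order given above, and two gap characters), and that "all amino acids" in rule 8 refers to the twenty standard residues. The function names are hypothetical.

```python
BLAST_SCORE_MAX = 999    # rule 9
BLAST_SCORE_MIN = -999   # rule 10
GAP_OPEN = 11            # rule 11
GAP_EXTEND = 1           # rule 12
COLUMN_ORDER = "ABCDEFGHIKLMNPQRSTVWXYZ"
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption for rule 8


def decode_char(ch):
    """Convert one string62 character to its integer score (rules 1 to 5)."""
    if ch == "z":
        return BLAST_SCORE_MIN
    if ch == "Z":
        return BLAST_SCORE_MAX
    if "A" <= ch <= "Y":
        return 64 - ord(ch)      # uppercase letters encode -1 to -25
    if "0" <= ch <= "9":
        return ord(ch) - 48      # digits encode 0 to 9
    if "a" <= ch <= "y":
        return ord(ch) - 87      # lowercase letters encode 10 to 34
    raise ValueError(f"not a string62 character: {ch!r}")


def decode_row(row):
    """Decode one 26-character row into a consensus residue, scores, and gap values."""
    if len(row) != 26:
        raise ValueError("each encoded row is expected to be 26 characters wide")
    scores = {aa: decode_char(ch) for aa, ch in zip(COLUMN_ORDER, row[1:24])}
    scores["B"] = min(scores["D"], scores["N"])                     # rule 6
    scores["Z"] = min(scores["Q"], scores["E"])                     # rule 7
    scores["X"] = min(scores[aa] for aa in STANDARD_AMINO_ACIDS)    # rule 8
    # The two encoded gap characters are ignored; fixed values are used instead.
    return {"consensus": row[0], "scores": scores,
            "gap_open": GAP_OPEN, "gap_extend": GAP_EXTEND}
```

Decoding every row in order, from the N-terminal end downward, would reconstruct the profile matrix prior to export.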

Example 4: Identification of Novel Acyltransferase Related Amino Acid Sequences

The profile (Figure 1) is used in further queries to identify a number of previously
unidentified proteins from yeast as novel acyltransferases. A protein is identified from an
Arabidopsis protein sequence database (ATAT1) (SEQ ID NO:2). Sequences are also
identified from nucleic acid databases (Table 2).



Table 2

Database ID Number    BLAST Search Hits                             Log probability

Saccharomyces cerevisiae
gi 1078509            Limnanthes putative LPAAT                     e-10 (SEQ ID NO:217)
gi 586485             Limnanthes putative LPAAT                     e-13 (SEQ ID NO:218)
                      Limnanthes putative LPAAT                     e-19 (SEQ ID NO:219)
                      SUPPRESSES CTR1 (choline transport mutant)    (SEQ ID NO:220)
gi 549627             similar to CTR1                               e-118 (SEQ ID
gi 2133031            unidentified                                  (SEQ ID NO:222)
gi 2132939            unidentified                                  (SEQ ID NO:223)
gi 2132299            TAFAZZIN                                      e-14 (SEQ ID NO:224)







In Table 2, the gi number is the database identifier, the middle column shows the
results of BLAST searches against the NCBI NR protein database, and the log probability
number represents the log of the probability of such a match occurring by random
chance. These proteins, including the ATAT1 protein sequence, are identified using the
original PSI-BLAST search of the NCBI NR protein database. Thus, these proteins are novel
acyltransferase related proteins with unidentified activities.

The Arabidopsis acyltransferase sequence, herein referred to as ATAT1, is also
identified using the original PSI-BLAST search of the NCBI NR protein database, and did not
have an annotated function.

Additional Arabidopsis amino acid sequences related to acyltransferases are identified
from the databases, referred to as ATAT2est, ATAT3est, ATAT4est, ATAT5est, ATAT6est,
ATAT7est, ATAT8est, ATAT9, ATAT10, and ATAT11est. Furthermore, Arabidopsis
amino acid sequences are identified which demonstrate sequence similarity to known
lysophosphatidic acid acyltransferases, referred to as ATLPAAT1. The sequences of ATAT9 and
ATAT10 are identified from the database as genomic sequences; all other Arabidopsis sequences
are identified as ESTs.

To obtain the entire coding region corresponding to the Arabidopsis acyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing acyltransferase related sequences. Primers are designed
according to the respective Arabidopsis acyltransferase related sequences (Table 3) and used
in Rapid Amplification of cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl.
Acad. Sci., 85:8998-9002) using the Marathon cDNA amplification kit (Clontech
Laboratories Inc, Palo Alto, CA). Primers with an R designation are used for 5' RACE
reactions, and primers with an F designation are used for 3' RACE reactions.



Table 3

ATAT2
ATAT2R1        CCATCCGCTTCAAGGGAACGACACCCATCA (SEQ ID NO:135)
ATAT2R2        TCCCTGTCTTGCTTGATGAACTTAAAGCTTG (SEQ ID NO:136)
ATAT2R3        ACAGCAGGAGTGTCTGATGATGGCAGATTC (SEQ ID NO:137)

ATAT3
ATAT3R1        ACTGGAGTTCCAGCCAAAAATGCACCTGTC (SEQ ID NO:138)
ATAT3R2        GATACACCCTTGAAATCAGGCGATTTTGCT (SEQ ID NO:139)

ATAT4
ATAT4R1        TTGCAAATTCAATTCCTGTTTCACCGGGCC (SEQ ID NO:140)
ATAT4R2        GTTTTCTGCTATTCCAGAAGGCGTCAACAA (SEQ ID NO:141)

ATAT5
ATAT5R1        CATTGAAGATCCGTCCGTGAAGTTNCCTTACC (SEQ ID NO:142)
ATAT5R2        TCGAGCTGTGATCGATGATTGGCTGTGAAG (SEQ ID NO:143)
ATAT5F1        GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO:144)
ATAT5F2        GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO:145)

ATAT6
H76348-F1      GTAGAGAGCCTTACTTGCTTCGGTTTAGTC (SEQ ID NO:146)
H76348-F2      ACGTCATCGTACCTGTTGCTATTGACTCAC (SEQ ID NO:147)
H76348-R1      ACTTTTCCATTGTCAGGGACTCCTCGACAC (SEQ ID NO:148)
H76348-R2      ACGGTGTAGGAAGGGAAAGGATTCAAAAGG (SEQ ID NO:149)

ATAT7
ATTS0193-F1    GCGATpAACTACAGAGTCGGATTCTTCCTC (SEQ ID NO:150)
ATTS0193-F2    CCGGTTTACGAGATTACGTTCTTGAACCAG (SEQ ID NO:151)
ATTS0193-R1    CAATGGAGACAAGGCTCGAAAGTGCTAACC (SEQ ID NO:152)
ATTS0193-R2    ATTCTCTGAACATAGTTCGCCACGGTCATG (SEQ ID NO:153)

ATAT8
AA042618-F1    GAAATCCAACGCCTTCCCAATATCACTCTG (SEQ ID NO:154)
AA042618-F2    CTTCAACTTTCCATCAGGATCTTGGCACGT (SEQ ID NO:155)
AA042618-R1    ACCACTTGTTAGAGACCTTACCTGCTTAGG (SEQ ID NO:156)
AA042618-R2    TCCTACCTACACCATCCAATTTCTCGACCC (SEQ ID NO:157)

ATAT11
ATAT11R1       CTGCGTCAAGTGAGCAACTCAGTTCTTGCA (SEQ ID NO:158)
ATAT11R2       TGGGAAGCAGCACGTTGTTCAGTATCGGAA (SEQ ID NO:159)
ATAT11R3       TAGCCTCTGTGTAATCTGTGCCCTCGGGGA (SEQ ID NO:160)



From the nucleic acid sequences obtained from the RACE reactions, protein sequence
is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences
are provided for ATAT1 (SEQ ID NO:1), ATAT2 (SEQ ID NO:3), ATAT3 (SEQ ID NO:5),
ATAT4 (SEQ ID NO:7), ATAT5 (SEQ ID NO:9), ATAT6 (SEQ ID NO:10), ATAT7 (SEQ
ID NO:12), ATAT8 (SEQ ID NO:14), ATAT9 (SEQ ID NO:16), ATAT10 (SEQ ID NO:18),
ATAT11 (SEQ ID NO:20), and ATLPAAT1 (SEQ ID NO:22).

The protein sequence (SEQ ID NO:2) derived from the ATAT1 nucleic acid sequence
from Arabidopsis has a predicted molecular mass of 32.5 kDa and a pI of 9.74. Alignment
of the Arabidopsis acyltransferase with several LPAAT and G3PAAT sequences shows that
some of the domains that are conserved between LPAAT and G3PAAT are conserved in the new
acyltransferase protein.
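
As a rough illustration of how a predicted molecular mass of this kind can be derived from a deduced amino acid sequence, the following Python sketch sums standard average residue masses and adds one water for the free termini. It is not the MacVector calculation used above; the function name and the inputs are hypothetical, and post-translational modifications are ignored.

```python
# Average residue masses (Da) for the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def predicted_mass_kda(protein: str) -> float:
    """Return the average molecular mass in kDa of a one-letter protein sequence.

    Hypothetical helper: raises KeyError for non-standard residue codes.
    """
    total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein.upper())
    return (total + WATER_MASS) / 1000.0
```

Applied to a full-length deduced sequence, a function of this kind should give a value in the same range as the figures reported here, although small differences from a commercial package are to be expected.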

The ATAT2 nucleic acid sequence is predicted to encode a 312 amino acid protein
(SEQ ID NO:4), with a molecular weight of 34.6 kD and a pI of 9.99. The ATAT2 protein
may also contain 2 to 3 transmembrane domains. However, the protein encoded by the
ATAT2 nucleic acid sequence may be longer than predicted because of the absence of an
in-frame stop codon upstream of the ATG start codon used.

The ATAT3 nucleic acid sequence is predicted to encode a 398 amino acid protein
(SEQ ID NO:6), with a molecular weight of 44.7 kD and a pI of 5.62. The ATAT3 protein
may contain 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted
to encode a 317 amino acid protein (SEQ ID NO:8), with a molecular weight of 36.5 kD and
a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains.

The ATLPAAT1 nucleic acid sequence is predicted to encode a 389 amino acid
protein (SEQ ID NO:23), with a molecular weight of 43.7 kD and a pI of 9.52. The
ATLPAAT1 protein is predicted to have up to 3 transmembrane domains. The protein
predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for
Brassica, maize, and meadowfoam (described in PCT Publication WO 94/13814). The
ATAT11 nucleic acid sequence is predicted to encode a 375 amino acid protein (SEQ ID
NO:21), with a molecular weight of 43.5 kD and a pI of 9.45. The deduced amino acid
sequences of ATAT6 (SEQ ID NO:11), ATAT7 (SEQ ID NO:13), ATAT8 (SEQ ID NO:15),
ATAT9 (SEQ ID NO:17), and ATAT10 (SEQ ID NO:19) are also provided.

A sequence region approximately 30 amino acids upstream through approximately
100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and
Rock, (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald, (1997) Curr. Biol. 7:R465-R466)
of the predicted amino acid sequences derived from the nucleic acid sequences of
ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10,
ATLPAAT1, and ATAT11 is compared to the amino acid sequences of lysophosphatidic
acid acyltransferase (jojoba AT (SEQ ID NO:162; the nucleic acid sequence is provided in
SEQ ID NO:161), maize AT (PCT Publication WO 94/13814), PLSC coco (GenBank
accession 1098605), PLSC Lim (GenBank accession 1209507), PLSC Ecoli (GenBank
accession 1209507), and PLSC Yeast (GenBank accession 464422)) and glycerol-3-phosphate
acyltransferase (PLSB Ecoli (GenBank accession 130326) and PLSB Mouse (GenBank
accession 2498786)) (Figure 2), and similarities are identified (Figure 2 and Figure 3).
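
The selection of the compared region can be sketched in Python as follows. This is only an illustration under stated assumptions: the window is taken to run from 30 residues upstream of the first HXXXXD match to 100 residues downstream of the first PEG that follows it, and the function name and default values are hypothetical rather than taken from the alignment procedure actually used.

```python
import re

HXXXXD = re.compile(r"H.{4}D")  # H, any four residues, then D
PEG = re.compile(r"PEG")


def conserved_region(protein, upstream=30, downstream=100):
    """Return the window around the HXXXXD and PEG sequences, or None.

    Assumption: the window starts `upstream` residues before HXXXXD and
    ends `downstream` residues after the first PEG found past HXXXXD.
    """
    protein = protein.upper()
    h_match = HXXXXD.search(protein)
    if h_match is None:
        return None
    peg_match = PEG.search(protein, h_match.end())
    if peg_match is None:
        return None
    start = max(0, h_match.start() - upstream)
    end = min(len(protein), peg_match.end() + downstream)
    return protein[start:end]
```

Windows extracted in this way from each sequence could then be passed to any multiple alignment program; sequences lacking either conserved sequence return None and would need to be inspected by hand.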

Sequence comparisons reveal that several classes of acyltransferases exist, based on
conserved amino acid sequences identified in the comparisons in Figure 2. For example,
ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid
sequences VTYSXS (SEQ ID NO:128), VXLTRXR (SEQ ID NO:129), and LXXGDLV (SEQ
ID NO:132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6,
ATAT7, ATAT8, and ATAT9 also contain the conserved sequence CPEGT (SEQ ID
NO:130), which comprises the PEG sequence, as well as IVPVA (SEQ ID NO:131) and
VANXXQ (SEQ ID NO:134) (Figure 2) downstream of the PEG sequence. The sequences
corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related in this class, with
similarities between ATAT1 and ATAT9 of 67.0%, between ATAT1 and ATAT7 of 58.2%,
and between ATAT9 and ATAT7 of 63.9% (Figure 3B).

Sequence comparisons also demonstrate that the sequence of ATLPAAT1 is most
closely related to the jojoba LPAAT (82.3% similar) and the maize LPAAT (78.0% similar).

Furthermore, sequence analysis demonstrates that ATAT4 is the most divergent
sequence, with its highest similarity to ATAT10 (18.5%). The highest similarity (15.3%) to a
known sequence is with a meadowfoam (Limnanthes douglasii) LPAAT. However, the
sequences of ATAT4 and ATAT10 share several conserved peptide sequences with the amino
acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO:127), where the H
comprises the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO:133)
downstream of the PEG sequence.
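
The two groups of conserved peptide sequences described above can be screened for with a short Python sketch. The class labels are hypothetical names chosen for this illustration, X is treated as any single residue, and the positional constraints relative to HXXXXD and PEG are deliberately not checked.

```python
import re

# Conserved peptide sequences named in the text; X stands for any amino acid.
CLASS_MOTIFS = {
    "ATAT1-like": ["VTYSXS", "VXLTRXR", "LXXGDLV", "CPEGT", "IVPVA", "VANXXQ"],
    "ATAT2-like": ["VXNHXS", "FXXGAF"],
}


def motif_to_regex(motif):
    """Compile a motif such as 'VXLTRXR' into a regular expression."""
    return re.compile(motif.upper().replace("X", "."))


def motif_hits(protein):
    """Map each (hypothetical) class label to the motifs found in the protein."""
    protein = protein.upper()
    return {
        label: [m for m in motifs if motif_to_regex(m).search(protein)]
        for label, motifs in CLASS_MOTIFS.items()
    }
```

A sequence that returns every motif for one label and none for the other would be a candidate member of that class, subject to confirmation by alignment.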

Example 6: Identification of Additional Acyltransferase Sequences

The novel Arabidopsis sequences identified above are used to search proprietary
databases containing soybean and corn EST sequences. The results of this search identify
EST sequences from soybean (SEQ ID NO:24 through SEQ ID NO:85) as well as from corn
(SEQ ID NO:86 through SEQ ID NO:126) as encoding acyltransferase related proteins.

Sequence comparisons between the various EST sequences and the complete
Arabidopsis sequences reveal that the identified EST sequences demonstrate high
similarity to the various Arabidopsis sequences as determined by BLAST scores.

Expressed Sequence Tag (EST) sequences from soybean and corn databases are
identified which are most closely related by BLAST score to ATAT1 (SEQ ID NOS:24-29
and SEQ ID NOS:86-88, respectively), ATAT2 (SEQ ID NO:30 and SEQ ID NO:89,
respectively), ATAT3 (SEQ ID NOS:31-35 and SEQ ID NOS:90-94, respectively), ATAT4
(SEQ ID NOS:36-44 and SEQ ID NOS:95-100, respectively), ATAT6 (SEQ ID NOS:45-49
and SEQ ID NO:101, respectively), ATAT7 (SEQ ID NOS:50-54 and SEQ ID NOS:102-103,
respectively), ATAT8 (SEQ ID NOS:55-56 and SEQ ID NO:104, respectively), ATAT9
(SEQ ID NOS:57-79 and SEQ ID NOS:105-111, respectively), ATAT10 (SEQ ID NOS:80-81
and SEQ ID NO:112, respectively), ATAT11 (SEQ ID NOS:82-85 and SEQ ID
NOS:123-126, respectively), and ATLPAAT1 (corn only, SEQ ID NOS:113-122).

Example 7: Expression Construct Preparation

A series of synthetic oligonucleotide primers were prepared for use in Polymerase
Chain Reactions (PCR) to amplify the entire DNA sequences encoding the various
acyltransferase sequences identified above. The sequences are listed in Table 3.



Table 3

Primer        Sequence (listed 5'-3')                                       SEQ ID NO:

ATAT1F        AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG                   163
ATAT1R        GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG                           164
ATAT2F        GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT                  165
ATAT2R        GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT                        166
ATAT3F        GGATCCGCGGCCGCATAATGGAATCAGAGCTCAAAGAT                        167
ATAT3R        GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC                        168
ATAT4F        GGATCCGCGGCCGCACAATGACTCGTTCACAAGATGTTTCA                     169
ATAT4R        GGATCCCCTGCAGGTCACTTCTCTTCCAATCTAGCCAG                        170
ATAT6F        GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA                171
ATAT6R        GGATCCCCTGCAGGTTATTTTTTCTTGACAACTCCGTTATTACCGG                172
ATAT7F        ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA                       173
ATAT7R        GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGAAAGT                        174
ATAT8F        GGATCCGCGGCCGCACAATGTCCGCCAAGATTTCAATATTCC                    175
ATAT8R        GGATCCCCTGCAGGTTAATTTTTCTTAACTACTCCATT                        176
ATAT9F        GGATCCGCGGCCGCACAATGGGAGCTCAGGAGAAACGGGGCC                    177
ATAT9R        GGATCCCCTGCAGGTCACGTCTTCTCCTTCTTCACCGG                        178
ATAT10F       GGATCCGCGGCCGCACAATGGCGGATCCTGATCTGTCTTCTCCT                  179
ATAT10R       GGATCCCCTGCAGGTTATGTTGGGGCCAAGTCAGGTGCAAAGAT                  180
ATAT11F       GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT                  181
ATAT11R       GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG                182
ATLPAAT1F     [illegible]                                                   183
ATLPAAT1R     GGATCCGCGGCCGCTTACTTCTCCTTCTCCG                               184
YSCAT1F       [illegible]                                                   185
YSCAT1R       [illegible]                                                   186
YSCAT1 KO F   ATGTCTTTTAGGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG  187
YSCAT1 KO R   TCAATCATCCTTACCCTTTGGTTTACCCTCTGGAGGCAGAAGATTGTACTGAGAGTGCAC  188
YSCAT2F       GGATCCGCGGCCGCACAATGAAGCATTCCCAAAAATACCGTAGG                  189
YSCAT2R       GGATCCCCTGCAGGTCAATGATTTTTTTTCATCACAAATA                      190
YSCAT2 KO F   ATGAAGCATTCCCAAAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG  191
YSCAT2 KO R   TCAATGATTTTTTTTCATCACAAATACAAGAA[illegible]AGATTGTACTGAGAGTGCAC  192
YSCAT3F       GGATCCGCGGCCGCACAATGGGTTTT[illegible]AAC                      193
YSCAT3R       GGATCCCCTGCAGGTTATTTC[illegible]TTTGC                         194
YSCAT3 KO F   [illegible]CTGTGCGGTATTTCACACCG                               195
YSCAT3 KO R   [illegible]AGATTGTACTGAGAGTGCAC                               196
YSCAT4F       [illegible]                                                   197
YSCAT4R       [illegible]                                                   198
YSCAT4 KO F   ATGGAAAAGTACACCAATTGGAGAGACAATGGTACGGGAACTGTGCGGTATTTCACACCG  199
YSCAT4 KO R   CTACTTCCTCTTTTTACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC  200
YSCAT5F       GGATCCGCGGCCGCACAATGCCTGCACCAAAACTCACGGAG                     201
YSCAT5R       GGATCCCCTGCAGGCTACGCATCTCCTTCTTTCCCTTC                        202
YSCAT5 KO F   ATGCCTGCACCAAAACTCACGGAGAAATCTGCCTCTTCCACTGTGCGGTATTTCACACCG  203
YSCAT5 KO R   CTACGCATCTCCTTCTTTCCCTTCTTCTTCTTCTTCCTCTAGATTGTACTGAGAGTGCAC  204
YSCAT6F       GGATCCGCGGCCGCACAATGTCTGCTCCCGCTGCCGATCATAACGC                205
YSCAT6R       GGATCCCCTGCAGGTCATTCTTTCTTTTCGTGTTCTCTTTTCTG                  206
YSCAT6 KO F   ATGTCTGCTCCCGCTGCCGATCATAACGCTGCCAAACCTACTGTGCGGTATTTCACACCG  207
YSCAT6 KO R   TCATTCTTTCTTTTCGTGTTCTCTTTTCTGTCTTACCAGCAGATTGTACTGAGAGTGCAC  208
YSCAT7F       GGATCCGCGGCCGCACAATGCTGCATCAAAAAATAGCTCATAAAGTTCG             209
YSCAT7R       GGATCCCCTGCAGGTCAAAAAATAAAACAATAAAGTTTATAAACTAACC             210
YSCAT7 KO F   ATGCTGCATCAAAAAATAGCTCATAAAGTTCGAAAAGTCGCTGTGCGGTATTTCACACCG  211
YSCAT7 KO R   TCAAAAAATAAAACAATAAAGTTTATAAACTAACCAAATTAGATTGTACTGAGAGTGCAC  212
YSCAT8F       GGATCCGCGGCCGCACAATGAGTGTGATAGGTAGGTTCTTG                     213
YSCAT8R       GGATCCCCTGCAGGTTAATGCATCTTTTTTACAGATGAACC                     214
YSCAT8 KO F   ATGAGTGTGATAGGTAGGTTCTTGTATTACTTGAGGTCCGCTGTGCGGTATTTCACACCG  215
YSCAT8 KO R   TTAATGCATCTTTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC  216


The entire coding regions for each of the acyltransferase sequences were amplified
using the respective primers listed in Table 3 above, cloned into the vector pCR2.1-TOPO
(Invitrogen) or pZero (Invitrogen), and labeled as pCGN8558 (ATAT1), pCGN8564
(ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6),
pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940
(ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1,
also referred to as gi2132299), pCGN9902 (YSCAT2, also referred to as gi1078509),
pCGN9903 (YSCAT3, also referred to as gi2132939), pCGN9904 (YSCAT4, also
referred to as gi2133031), pCGN9905 (YSCAT5, also referred to as gi320748), pCGN9906
(YSCAT6, also referred to as gi549627), pCGN9907 (YSCAT7, also referred to as
gi586485), and pCGN9908 (YSCAT8, also referred to as gi464422). The nucleic acid
sequences for the respective yeast acyltransferases are provided as YSCAT1 (SEQ ID
NO:225), YSCAT2 (SEQ ID NO:226), YSCAT3 (SEQ ID NO:227), YSCAT4 (SEQ ID
NO:228), YSCAT5 (SEQ ID NO:229), YSCAT6 (SEQ ID NO:230), YSCAT7 (SEQ ID
NO:231), and YSCAT8 (SEQ ID NO:232).

7A. Baculovirus Expression Constructs

Constructs are prepared to direct the expression of the Arabidopsis ATAT sequences
in cultured insect cells. The entire coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10, and 11 are
cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and
PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double-stranded
DNA sequence was obtained to verify that no errors were introduced by PCR
amplification. The resulting plasmids were designated pCGN9723 (ATAT1), pCGN9724
(ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728
(ATAT7), pCGN9729 (ATAT8), pCGN9991 (ATAT9), pCGN9730 (ATAT10), and pCGN9731
(ATAT11).
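
Because the coding sequences are cloned as NotI/Sse8387I fragments, most forward primers in the table above carry a NotI site ahead of the ATG and most reverse primers carry an Sse8387I site ahead of the reverse complement of a stop codon. The following Python sketch shows one way such primers could be checked before ordering. The helper names are hypothetical; the recognition sequences used (BamHI GGATCC, NotI GCGGCCGC, Sse8387I CCTGCAGG) are the standard ones.

```python
# Recognition sequences of the enzymes relevant to the primer tails.
SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC", "Sse8387I": "CCTGCAGG"}

# Reverse complements of the stop codons TAA, TGA and TAG.
REVERSED_STOPS = {"TTA", "TCA", "CTA"}


def sites_in(primer):
    """List the enzymes from SITES whose recognition sequence occurs in the primer."""
    primer = primer.upper()
    return [name for name, site in SITES.items() if site in primer]


def stop_codon_after_site(reverse_primer, site="CCTGCAGG"):
    """Return the triplet following `site` if it is a reverse-complemented stop codon.

    Hypothetical check: returns None when the site is absent or when the
    next three bases do not correspond to a stop codon.
    """
    primer = reverse_primer.upper()
    index = primer.find(site)
    if index < 0:
        return None
    triplet = primer[index + len(site): index + len(site) + 3]
    return triplet if triplet in REVERSED_STOPS else None
```

For example, the ATAT2 forward primer contains BamHI and NotI sites, while the ATAT2 reverse primer contains BamHI and Sse8387I sites followed by TTA, the reverse complement of a TAA stop codon.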



7B. Plant Expression Construct Preparation 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self-annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:233) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed specific
expression cassette from pCGN3223.
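
That the adapter can anneal to a copy of itself and leave ends compatible with a BssHII-digested vector can be checked with a short Python sketch. The adapter is taken here as the 56-nucleotide sequence ending in ATTTAAAT, and the function names are hypothetical. The check looks for the shortest 5' extension that, once removed, leaves a sequence equal to its own reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def self_annealed_overhang(oligo):
    """Return the 5' overhang left when an oligo pairs with a copy of itself.

    The first k bases are treated as the overhang if the remainder is its
    own reverse complement; the smallest such k is used. Returns None when
    no self-complementary remainder exists.
    """
    oligo = oligo.upper()
    for k in range(len(oligo)):
        core = oligo[k:]
        if core == reverse_complement(core):
            return oligo[:k]
    return None


ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"
```

With this sequence the function returns CGCG, the same 5' overhang that BssHII digestion produces, and the self-complementary core contains the NotI and Sse8387I recognition sequences.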

The cloning cassette pCGN7787 contains essentially the same elements as
pCGN7770, except that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:234) and
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:235) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.
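
The compatibility of the annealed oligonucleotide pair (SEQ ID NO:234 and SEQ ID NO:235) with SalI/XhoI-digested vector can be illustrated with the following Python sketch. It handles only the case in which both strands carry 5' overhangs, so it would not apply unchanged to the SalI/SacI linkers, which also carry a 3' overhang. The function names and the `max_overhang` limit are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, bottom, max_overhang=8):
    """Return the 5' overhangs of two annealed oligos, both written 5' to 3'.

    Searches for overhang lengths a and b such that the remainder of the
    top strand is the reverse complement of the remainder of the bottom
    strand. Returns (top_overhang, bottom_overhang) or None.
    """
    top, bottom = top.upper(), bottom.upper()
    for a in range(max_overhang + 1):
        for b in range(max_overhang + 1):
            if top[a:] and top[a:] == reverse_complement(bottom[b:]):
                return top[:a], bottom[:b]
    return None


SEQ_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
SEQ_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"
```

For this pair the function returns a TCGA overhang on each strand, which matches the 5'-TCGA ends left by both SalI and XhoI, around a 28 base pair duplex carrying the BamHI, NotI, HindIII and Sse8387I sites.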

The plasmid pCGN8619 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:236) and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:237) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended
by filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:238) and
5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:239) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:240) and
5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:241) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended
by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The coding regions of the various acyltransferase sequences were cloned as
NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624, and pCGN8625, for
expression in sense or antisense orientations from a tissue preferential promoter, napin, or the
35S promoter. Fragments which were cloned into the pCGN8622 vector created the
constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596
(ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973
(ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for
the sense expression of the respective coding sequences from the napin promoter. Fragments
which were cloned into the pCGN8623 vector created the constructs pCGN8900 (ATAT1),
pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6),
pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10),
pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for the antisense expression of the
respective coding sequences from the napin promoter. Fragments which were cloned into the
pCGN8624 vector created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2),
pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7),
pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11),
and pCGN8635 (ATLPAAT1) for the sense expression of the respective coding sequences
from the 35S promoter. Fragments which were cloned into the pCGN8625 vector created the
constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for the antisense expression of
the respective coding sequences from the 35S promoter.

In addition, the yeast acyltransferase coding sequences were cloned into the vector
pCGN8624 creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2),
pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931
(YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow for
the sense expression of the respective acyltransferase coding sequences from the 35S
promoter in plant cells.

Example 8: Plant Transformation

A variety of methods have been developed to insert a DNA sequence of interest into the
genome of a plant host to obtain the transcription or transcription and translation of the sequence
to effect phenotypic changes.

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Natl. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994), Science 265:1856-1860),
Bechtold et al. ((1993), C.R. Acad. Sci., Life Sciences 316:1194-1199), or Clough et al. ((1998)
Plant J. 16:735-43). Other plant species may be similarly transformed using related
techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

The above results demonstrate that the nucleic acid sequences identified encode
proteins which are related to acyltransferase proteins. Such
acyltransferase sequences find use in preparing expression constructs for plant
transformations.

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.

Claims

What is Claimed is:

1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:127
(VxNHxS) wherein the H is the conserved Histidine residue in the conserved peptide
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:128
(VTYSxS) within about 30 amino acids downstream from the conserved amino acid sequence
HXXXXD of said acyltransferase-like protein, x representing any amino acid.

3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:129
(VxLTRxR) within about 60 amino acids downstream from the conserved amino acid
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.

4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:132
(LxxGDLV) within about 20 amino acids upstream of the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:130 (CPEGT)
containing the conserved amino acid sequence PEG of said acyltransferase-like protein.

6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:133
(FxxGAF) within about 20 amino acids downstream from the conserved amino acid sequence
PEG of said acyltransferase-like protein, x representing any amino acid.

7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:131 (IVPVA)
within about 40 amino acids downstream from the conserved amino acid sequence PEG of
said acyltransferase-like protein.

8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO:134
(VANxxQ) within about 110 amino acids downstream from the conserved amino acid
sequence PEG of said acyltransferase-like protein, x representing any amino acid.

9. A DNA sequence encoding an enzyme of the class of acyltransferase-like proteins,
said DNA sequence obtainable by the steps comprising:

(a) using the profile of Figure 1 to search a nucleic acid sequence database;

(b) obtaining a probability score for nucleic acid sequences in said sequence
database using the Smith-Waterman algorithm; and

(c) selecting a nucleic acid sequence having a probability score of less than about 1.

10. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an encoding sequence.

11. The DNA encoding sequence according to Claim 9, wherein said DNA sequence
is an EST.

12. The DNA encoding sequence according to any one of Claims 1 to 11, wherein
said acyltransferase-like protein is from a plant.

13. A construct comprising a DNA sequence of any one of Claims 1 to 11 linked to a
heterologous transcriptional and translational initiation region functional in a host cell.

14. The construct according to Claim 13 wherein said host cell is a plant cell.

15. A plant cell comprising a DNA construct according to Claim 13.

16. A plant comprising a cell according to Claim 15.

17. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from Arabidopsis thaliana.

18. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from corn.

19. The DNA encoding sequence of Claim 18 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:86 through SEQ ID NO:126.

20. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from soybean.

21. The DNA encoding sequence of Claim 20 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO:24 through SEQ ID NO:85.

22 . The DNA encoding sequence of any one of Claims 2, 3, 4, 5, 7 and 8 wherein 

3 0 said acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 1 , SEQ 

ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14 and SEQ ID NO: 16 
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23 . The DNA encoding sequence of either of Claim 1 and Claim 6 wherein said 
acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 3, SEQ ID 
NO: 5, SEQ ID NO: 7 and SEQ ID NO: 18. 
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SEQUENCE LISTING 



PCT/US99/22231 



<110> Lassner, Michael W 
Emig, Robin A 
Ruezinsky, Diane 
Van Eenennaam, Alison 

<120> Novel Plant Acyl transferases 



<130> 17029/00/WO 



<140> 
<141> 



<150> 60/101,939 
<151> 1998-09-25 



<160> 241 



<170> Patentin Ver . 2.0 



<210> 1
<211> 869
<212> DNA
<213> Arabidopsis sp.

<400> 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata 60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta 300
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt 360
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg 840
tggagtctat caacaacacc aagaagtga 869

<210> 2
<211> 289
<212> PRT
<213> Arabidopsis sp.

<400> 2

Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu
1 5 10 15

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr
20 25 30

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile
35 40 45

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe
50 55 60

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly
65 70 75 80

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val
85 90 95

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu
100 105 110

Gly Arg Lys Ile Cys Cys Val Thr Tyr Ser Val Ser Arg Leu Ser Leu
115 120 125

Met Leu Ser Pro Ile Pro Ala Val Ala Leu Thr Arg Asp Arg Ala Thr
130 135 140

Asp Ala Ala Asn Met Arg Lys Leu Leu Glu Lys Gly Asp Leu Val Ile
145 150 155 160

Cys Pro Glu Gly Thr Thr Cys Arg Glu Glu Tyr Leu Leu Arg Phe Ser
165 170 175

Ala Leu Phe Ala Glu Leu Ser Asp Arg Ile Val Pro Val Ala Met Asn
180 185 190

Cys Lys Gln Gly Met Phe Asn Gly Thr Thr Val Arg Gly Val Lys Phe
195 200 205

Trp Asp Pro Tyr Phe Phe Phe Met Asn Pro Arg Pro Ser Tyr Glu Ala
210 215 220

Thr Phe Leu Asp Arg Leu Pro Glu Glu Met Thr Val Asn Gly Gly Gly
225 230 235 240

Lys Thr Pro Ile Glu Val Ala Asn Tyr Val Gln Lys Val Ile Gly Ala
245 250 255

Val Leu Gly Phe Glu Cys Thr Glu Leu Thr Arg Lys Asp Lys Tyr Leu
260 265 270

Leu Leu Gly Gly Asn Asp Gly Lys Val Glu Ser Ile Asn Asn Thr Lys
275 280 285

Lys



<210> 3 
<211> 939 
<212> DNA 

<213> Arabidopsis sp. 



<400> 3 

atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60
agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcctta cagatttctt 120
gataagaaat cacctagatc aagtcaattg gcaagagata tcactgtgag agcagatctt 180
tcaggagctg caacccctga ctcttctttt cctgaaccag agattaagtt gagctcaaga 240
ctcagaggga tattcttttg tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300
atgattattg ggcatccgtt cgtccttctc ttcgatccct ataggagaaa attccaccac 360
ttcattgcta aactttgggc ttccataagc atttatccgt tttacaaaat caacatcgag 420
ggtttggaga atctgccatc atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480
tttctggata tctacacact tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540
gggatattcg taattcccat catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600
aagcggatgg acccaagaag ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660
aagggagcat ctgtgttttt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720
tctttcaaga aaggcgcatt tacagtggct gcgaagaccg gagttgcagt agttccaata 780
acgctaatgg gaacaggcaa aatcatgcca acgggtagtg aaggtatact gaaccatggg 840
aatgtgagag ttatcatcca taaaccaata catggaagca aagcggatgt tctttgcaac 900
gaggccagaa gcaagattgc agaatcaatg gatctctaa 939
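The DNA entries of this listing are laid out as rows of ten-base groups, each row ending in a running base count. The following is a minimal sketch, not part of the original listing, of how such rows could be parsed and their running counts checked. The function names `parse_row` and `counters_consistent` are hypothetical, and the two sample rows are the first two rows of SEQ ID NO: 3.

```python
def parse_row(line):
    """Split one listing row into (bases, running_count)."""
    tokens = line.split()
    # The last token is the running base count printed at the end of the row.
    count = int(tokens.pop())
    return "".join(tokens), count


def counters_consistent(rows):
    """Return True if each printed count equals the number of bases read so far."""
    total = 0
    for line in rows:
        bases, count = parse_row(line)
        total += len(bases)
        if total != count:
            return False
    return True


# First two rows of SEQ ID NO: 3 as they appear in the listing.
rows = [
    "atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60",
    "agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcctta cagatttctt 120",
]
print(counters_consistent(rows))
```

A check of this kind can flag rows in which a base was dropped or inserted during text extraction, since the running count would then disagree with the number of bases actually present.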



<210> 4 
<211> 312 
<212> PRT 

<213> Arabidopsis sp. 



<400> 4 

Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe
1 5 10 15

Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu
20 25 30

Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser
35 40 45

Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala
50 55 60

Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg
65 70 75 80

Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe
85 90 95

Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp
100 105 110

Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser
115 120 125

Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn
130 135 140

Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser
145 150 155 160

Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe
165 170 175

Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met
180 185 190

Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln
195 200 205

Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser
210 215 220

Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly
225 230 235 240

Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala
245 250 255

Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly
260 265 270

Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys
275 280 285

Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser
290 295 300

Lys Ile Ala Glu Ser Met Asp Leu
305 310
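The protein of SEQ ID NO: 4 appears to be the conceptual translation of the coding sequence of SEQ ID NO: 3. As an illustrative check only, the sketch below translates the first 60 bases of SEQ ID NO: 3 and prints the result in one-letter code, which can be compared with the first 20 residues of SEQ ID NO: 4. It assumes the standard genetic code; the helper name `translate` is hypothetical and is not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces allowed) until the first stop codon.

    Ambiguity codes such as 'n' or 'w' are not handled and raise KeyError.
    """
    dna = "".join(dna.lower().split())
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row (bases 1 to 60) of SEQ ID NO: 3.
first_row = "atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca"
print(translate(first_row))  # MTSFTTSLHAVPSEKFMGET
```

The printed string corresponds to Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe Met Gly Glu Thr, the first 20 residues listed for SEQ ID NO: 4.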

<210> 5 
<211> 1197 
<212> DNA 

<213> Arabidopsis sp. 
<400> 5 

atggaatcag agctcaaaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60 
cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120 
ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 180 
acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240 
tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300
ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360
gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420
gttttgcttt tcgtttttgg gttttattgg attcacgaga gctgtccaga tcgagattca 480
gacatggatt ctaatcctaa aactacttct acagagatta accagaaagg ggaagccgcc 540
acggaggaac ctgaaagacc tggagccatt gtgtccaatc atgtttcgta cttggacatt 600
ttgtatcata tgtctgcttc ttttccaagt tttgttgcca agagatcagt gggcaaactt 660
cctcttgttg gcctcattag caaatgcctt ggttgtgtct atgttcaaag agaagcaaaa 720
tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agctcatagc 780
aataaatctg ctccaactat tatgcttttt ccagaaggaa caactaccaa tggagactac 840
ttacttacat tcaagacagg tgcatttttg gctggaactc cagttcttcc ggtaatatta 900
aaatatccgt atgagcgctt cagtgtggca tgggatacca tatccggggc acgccacatt 960
ttattccttc tctgtcaagt cgtaaatcac ttggaagtca tacggttacc tgtatactac 1020
ccatcccaag aagagaaaga cgatcccaaa ctttatgcta gcaatgttcg gaaattaatg 1080
gccaccgagg gtaacttgat tctatcggag ttgggactta gcgacaaaag gatatatcac 1140
gcaactctca atggtaatct tagtcaaacc cgtgatttcc atcagaaaga agaatga 1197



<210> 6 
<211> 398 
<212> PRT 

<213> Arabidopsis sp. 



<400> 6 

Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser
1 5 10 15

Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala
20 25 30

Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp
35 40 45

Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile
50 55 60

Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu
65 70 75 80

Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr
85 90 95

Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly
100 105 110

Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg
115 120 125

Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe
130 135 140

Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser
145 150 155 160

Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys
165 170 175

Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser
180 185 190

Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe
195 200 205

Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly
210 215 220

Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys
225 230 235 240

Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg
245 250 255

Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu
260 265 270

Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala
275 280 285

Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr
290 295 300

Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile
305 310 315 320

Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu
325 330 335

Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr
340 345 350

Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu
355 360 365

Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn
370 375 380

Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu
385 390 395



<210> 7 
<211> 1131 
<212> DNA 

<213> Arabidopsis sp.

<400> 7
atgagcagta cggcagggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60
aacatcgaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120
ctgcgtgatt tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 180
gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240
ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300
tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360
ttgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatt 420
tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480
atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540
gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600
caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660
cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt 720
ctcatatttc ccgaagggac atgtgtaaat aataattaca cagtgatgtt taagaagggt 780
gcttttgaat tggactgcac tgtttgtcca attgcaatta aatacaacaa gatttttgtt 840
gacgccttct ggaatagcag aaaacaatca tttactatgc acttgctgca actcatgaca 900
tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960
acaggaattg aatttgcaga gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020
aaggtccctt gggatggata cttgaagtat tcgagaccaa gccccaagca tagtgaacgc 1080
aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg a 1131



<210> 8 
<211> 376 
<212> PRT 

<213> Arabidopsis sp.
<400> 8

Met Ser Ser Thr Ala Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp
1 5 10 15

Leu Asp His Pro Asn Ile Glu Asp Tyr Leu Pro Ser Gly Ser Ser Ile
20 25 30

Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp Ile Ser
35 40 45

Pro Thr Leu Thr Glu Ala Ala Gly Ala Ile Val Asp Asp Ser Phe Thr
50 55 60

Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn Ile Tyr
65 70 75 80

Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Arg Tyr Cys Ile Leu
85 90 95

Phe Pro Leu Arg Cys Phe Thr Leu Ala Phe Gly Trp Ile Ile Phe Leu
100 105 110

Ser Leu Phe Ile Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu
115 120 125

Arg Lys Lys Ile Glu Arg Val Leu Val Glu Met Ile Cys Ser Phe Phe
130 135 140

Val Ala Ser Trp Thr Gly Val Val Lys Tyr His Gly Pro Arg Pro Ser
145 150 155 160

Ile Arg Pro Lys Gln Val Tyr Val Ala Asn His Thr Ser Met Ile Asp
165 170 175

Phe Ile Val Leu Glu Gln Met Thr Ala Phe Ala Val Ile Met Gln Lys
180 185 190

His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr Ile Leu Glu Ser Val
195 200 205

Gly Cys Ile Trp Phe Asn Arg Ser Glu Ala Lys Asp Arg Glu Ile Val
210 215 220

Ala Lys Lys Leu Arg Asp His Val Gln Gly Ala Asp Ser Asn Pro Leu
225 230 235 240

Leu Ile Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met
245 250 255

Phe Lys Lys Gly Ala Phe Glu Leu Asp Cys Thr Val Cys Pro Ile Ala
260 265 270

Ile Lys Tyr Asn Lys Ile Phe Val Asp Ala Phe Trp Asn Ser Arg Lys
275 280 285

Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Ala Val
290 295 300

Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr Ile Arg Pro Gly Glu
305 310 315 320

Thr Gly Ile Glu Phe Ala Glu Arg Val Arg Asp Met Ile Ser Leu Arg
325 330 335

Ala Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg
340 345 350

Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser
355 360 365

Ile Leu Ala Arg Leu Glu Glu Lys
370 375



<210> 9 
<211> 965 
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<212> DNA 

<213> Arabidopsis sp. 



<400> 9 

gttgttaagt tacaagtctc ttcaaaaaca cacacacacg tctctcttca cagccaatca 60
tcgatcacag ctcgattttc ctttattgtt ccgttggttt tcttgagnat ttttctttct 120
tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 180
tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240
gccatggctc gtcaattcca tggaaatcat caaaatccta aggttcttga tcagactcta 300
cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360
aagaaagtgc ggttcgcgga taatgtgaaa gatacgaaag gtaacgggga agagtaccgg 420
aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaccggg aaagaccggt 480
tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540
agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600
gatttaggtt ttgtaaatct ttcttttgtt tttcggtaat attagatttt ttcttggaaa 660
tttcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720
tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780
gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgaaagaata 840
taaatttgta aaaacatagt gtgcctattg tacatataaa ctctcttttg ttggggatat 900
ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960
aaaaa 965



<210> 10 
<211> 1593 
<212> DNA 

<213> Arabidopsis sp.



<400> 10 

atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60
attctccgtc gttggtgtca tcgtagccct aaacaaaaat accaaaaatg cccttctcac 120
ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180
ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg 240
gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa 300
atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagcttccga 360
gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag 420
gttttgaaaa gaggaggcaa gagagttgct gtgagtgatt taccacaagt tatgattgat 480
gtattcttgc gagattactt ggagatagaa gttgtggtcg gaagagacat gaaaatggtc 540
ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa 600
gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660
tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720
tcagacaaga aaagttggca aaccctacca caagatcaat accctaaacc attgattttc 780
cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcatgtgg 840
gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttaccttac 900
tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960
cacaacgacc taatatccgc cgacagaaaa agaggttgtc tctttgtgtg taaccataga 1020
acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 1080
acgtatagtc taagcagatt atctgagctt ctggctccga tcaagaccgt tagattgact 1140
cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg 1200
gtttgtccgg aagggactac gtgtagagag ccttacttgc ttcggtttag tccacttttc 1260
tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 1320
ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 1380
ccttcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcga 1440
ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 1500
gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatcttg 1560
gccggtaata acggagttgt caagaaaaaa taa 1593

<210> 11
<211> 530
<212> PRT
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<213> Arabidopsis sp. 
<400> 11 

Met Ser Gly Asn Lys Ile Ser Thr Leu Gln Ala Leu Val Phe Phe Leu
1 5 10 15

Tyr Arg Phe Phe Ile Leu Arg Arg Trp Cys His Arg Ser Pro Lys Gln
20 25 30

Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu
35 40 45

Ser Asn His Thr Leu Ile Phe Asn Val Glu Gly Ala Leu Leu Lys Ser
50 55 60

Asn Ser Leu Phe Pro Tyr Phe Met Val Val Ala Phe Glu Ala Gly Gly
65 70 75 80

Val Ile Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe Ile Ser Leu
85 90 95

Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe
100 105 110

Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys
115 120 125

Tyr Phe Leu Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg
130 135 140

Gly Gly Lys Arg Val Ala Val Ser Asp Leu Pro Gln Val Met Ile Asp
145 150 155 160

Val Phe Leu Arg Asp Tyr Leu Glu Ile Glu Val Val Val Gly Arg Asp
165 170 175

Met Lys Met Val Gly Gly Tyr Tyr Leu Gly Ile Val Glu Asp Lys Lys
180 185 190

Asn Leu Glu Ile Ala Phe Asp Lys Val Val Gln Glu Glu Arg Leu Gly
195 200 205

Ser Gly Arg Arg Leu Ile Gly Ile Thr Ser Phe Asn Ser Pro Ser His
210 215 220

Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu Ile Tyr Phe Val Arg Asn
225 230 235 240

Ser Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys
245 250 255

Pro Leu Ile Phe His Asp Gly Arg Leu Ala Val Lys Pro Thr Pro Leu
260 265 270

Asn Thr Leu Val Leu Phe Met Trp Ala Pro Phe Ala Ala Val Leu Ala
275 280 285

Ala Ala Arg Leu Val Phe Gly Leu Asn Leu Pro Tyr Ser Leu Ala Asn
290 295 300

Pro Phe Leu Ala Phe Ser Gly Ile His Leu Thr Leu Thr Val Asn Asn
305 310 315 320

His Asn Asp Leu Ile Ser Ala Asp Arg Lys Arg Gly Cys Leu Phe Val
325 330 335

Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr Ile Ser Tyr Ala Leu
340 345 350

Arg Lys Lys Asn Met Lys Ala Val Thr Tyr Ser Leu Ser Arg Leu Ser
355 360 365

Glu Leu Leu Ala Pro Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val
370 375 380

Lys Asp Gly Gln Ala Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val
385 390 395 400

Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe
405 410 415

Ser Pro Leu Phe Ser Glu Val Cys Asp Val Ile Val Pro Val Ala Ile
420 425 430

Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys
435 440 445

Ala Phe Asp Pro Ile Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr
450 455 460

Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Ser Thr Cys Arg
465 470 475 480

Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Ala Asn His Val
485 490 495

Gln His Glu Ile Gly Asn Ala Leu Gly Phe Glu Cys Thr Asn Leu Thr
500 505 510

Arg Arg Asp Lys Tyr Leu Ile Leu Ala Gly Asn Asn Gly Val Val Lys
515 520 525

Lys Lys
530



<210> 12 
<211> 1509 
<212> DNA 

<213> Arabidopsis sp. 



<400> 12 

atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120
ctaattcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaagc 300
atggacacgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgagga gcatattcat 600
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660
gtgatatttc acgacggaag actagtgaag cggccaacgc cggccaccgc tctcatcatc 720
ctcctttgga tcccatttgg aatcattctc gccgtgatcc ggatctttct tggagccgtc 780
ctcccattgt gggccacacc ttacgtctct cagatattcg gtggccatat catcgtcaaa 840
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900
agaaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960
acttactcaa tctcgcgctt atcagagatc ttatctccca ttccaaccgt ccgattgaca 1020
agaatccgag atgtggatgc ggctaagatc aaacaacaac tgtcaaaagg agatctagtg 1080
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140
gctgagttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320
aagagcccgc atgacgtggc gaactatgtt cagagaatct tggcggctac gttagggttt 1380
gagtgcacca acttcacaag aaaagataag tatagggttc tcgctggaaa cgatggaaca 1440
gtgtcgtact tgtcgttgct agaccaattg aagaaggtgg ttagcacttt cgagccttgt 1500
ctccattga 1509

<210> 13 
<211> 502 
<212> PRT 

<213> Arabidopsis sp. 
<400> 13 

Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu
1 5 10 15

Phe Glu Gly Thr Ile Leu Lys Asn Ala Asp Ser Phe Ser Tyr Phe Met
20 25 30

Leu Val Ala Phe Glu Ala Ala Gly Leu Ile Arg Phe Ala Ile Leu Leu
35 40 45

Phe Leu Trp Pro Val Ile Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn
50 55 60

Ala Ala Leu Lys Leu Lys Ile Phe Val Ala Thr Val Gly Leu Arg Glu
65 70 75 80

Pro Glu Ile Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met
85 90 95

Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lys Lys
100 105 110

Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Ala
115 120 125

Lys Glu His Leu Arg Ala Asp Glu Val Ile Gly Thr Glu Leu Ile Val
130 135 140

Asn Arg Phe Gly Phe Val Thr Gly Leu Ile Arg Glu Thr Asp Val Asp
145 150 155 160

Gln Ser Ala Leu Asn Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro
165 170 175

Gln Leu Gly Leu Gly Lys Pro Ala Leu Thr Ala Ser Thr Asn Phe Leu
180 185 190

Ser Leu Cys Glu Glu His Ile His Ala Pro Ile Pro Glu Asn Tyr Asn
195 200 205

His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val Ile Phe His
210 215 220

Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Ile Ile
225 230 235 240

Leu Leu Trp Ile Pro Phe Gly Ile Ile Leu Ala Val Ile Arg Ile Phe
245 250 255

Leu Gly Ala Val Leu Pro Leu Trp Ala Thr Pro Tyr Val Ser Gln Ile
260 265 270

Phe Gly Gly His Ile Ile Val Lys Gly Lys Pro Pro Gln Pro Pro Ala
275 280 285

Ala Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met
290 295 300

Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser Ile Pro Ala Val
305 310 315 320

Thr Tyr Ser Ile Ser Arg Leu Ser Glu Ile Leu Ser Pro Ile Pro Thr
325 330 335

Val Arg Leu Thr Arg Ile Arg Asp Val Asp Ala Ala Lys Ile Lys Gln
340 345 350

Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys
355 360 365

Arg Glu Pro Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr
370 375 380

Asp Arg Ile Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His
385 390 395 400

Ala Thr Thr Ala Arg Gly Trp Lys Gly Leu Asp Pro Ile Phe Phe Phe
405 410 415

Met Asn Pro Arg Pro Val Tyr Glu Ile Thr Phe Leu Asn Gln Leu Pro
420 425 430

Met Glu Ala Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn
435 440 445

Tyr Val Gln Arg Ile Leu Ala Ala Thr Leu Gly Phe Glu Cys Thr Asn
450 455 460

Phe Thr Arg Lys Asp Lys Tyr Arg Val Leu Ala Gly Asn Asp Gly Thr
465 470 475 480

Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr
485 490 495

Phe Glu Pro Cys Leu His
500



<210> 14 
<211> 1563 
<212> DNA 

<213> Arabidopsis sp. 



<400> 14 

atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60
cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120
gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180
ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240
ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300
gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360
cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420
aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480
tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540
ggtatcatgg aggataaaac caaacatgat cttgtctttg atgagttagt tcgtaaagag 600
agactaaaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660
ctattctctc agttttgcca ggaaatttat ttcgtgaaga aatcagacaa gcgaagctgg 720
caaaccctac cacgaagcca gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780
atcaaaccaa ccctaatgaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 840
gccgcagcag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900
ctcgcctttt ccggttgcag actaaccgtc actaacgact acgtttcatc tcaaaaacaa 960
aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 1020
ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 1080
agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 1140
gacggtcaag ccatggagaa attgttaacc gaaggagatc tcgttgtttg tcctgaagga 1200
accacttgta gagaacctta cctgcttagg tttagccctt tgttcaccga ggttagtgat 1260
gtcatcgttc ccgtggctgt gacggtacac gtgaccttct tctacggtac aacggcgagt 1320
ggtcttaagg cacttgaccc gcttttcttc ctcttggatc cttatcctac ctacaccatc 1380
caatttctcg accctgtctc cggtgccacg tgccaagatc ctgatggaaa gttgaagttt 1440
gaggtggcca acaatgttca gagtgatatt gggaaggcgc tggatttcga gtgcacaagt 1500
ctcactagaa aagacaagta tttgatcttg gccggtaata atggagtagt taagaaaaat 1560
taa 1563

<210> 15 
<211> 520
<212> PRT

<213> Arabidopsis sp.
<400> 15

Met Ser Ala Lys Ile Ser Ile Phe Gln Ala Leu Val Phe Leu Phe Tyr
1 5 10 15

Arg Phe Ile Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn
20 25 30

Gly Pro Ser Ser Leu Leu Gln Ser Asp Leu Ser Arg His Thr Leu Ile
35 40 45

Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr
50 55 60

Phe Met Leu Val Ala Phe Glu Ala Gly Gly Val Ile Arg Ser Phe Leu
65 70 75 80

Leu Phe Ile Leu Tyr Pro Leu Ile Ser Leu Met Ser His Glu Met Gly
85 90 95

Val Lys Val Met Val Met Val Ser Phe Phe Gly Ile Lys Lys Glu Gly
100 105 110

Phe Arg Ala Gly Arg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val
115 120 125

Gly Leu Glu Ile Phe Glu Val Leu Lys Arg Gly Gly Lys Lys Ile Gly
130 135 140

Val Ser Asp Asp Leu Pro Gln Val Met Ile Glu Gly Phe Leu Arg Asp
145 150 155 160

Tyr Leu Glu Ile Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly
165 170 175

Gly Tyr Tyr Leu Gly Ile Met Glu Asp Lys Thr Lys His Asp Leu Val
180 185 190

Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val Ile
195 200 205

Gly Ile Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln
210 215 220

Phe Cys Gln Glu Ile Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp
225 230 235 240

Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu Ile Phe His Asp
245 250 255

Gly Arg Leu Ala Ile Lys Pro Thr Leu Met Asn Thr Leu Val Leu Phe
260 265 270

Met Trp Gly Pro Phe Ala Ala Ala Ala Ala Ala Ala Arg Leu Phe Val
275 280 285

Ser Leu Cys Ile Pro Tyr Ser Leu Ser Ile Pro Ile Leu Ala Phe Ser
290 295 300

Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln
305 310 315 320

Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg Thr
325 330 335

Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn Ile
340 345 350

Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu Ile Leu Ala Pro
355 360 365

Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Ala
370 375 380

Met Glu Lys Leu Leu Thr Glu Gly Asp Leu Val Val Cys Pro Glu Gly
385 390 395 400

Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr
405 410 415

Glu Val Ser Asp Val Ile Val Pro Val Ala Val Thr Val His Val Thr
420 425 430

Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys Ala Leu Asp Pro Leu
435 440 445

Phe Phe Leu Leu Asp Pro Tyr Pro Thr Tyr Thr Ile Gln Phe Leu Asp
450 455 460

Pro Val Ser Gly Ala Thr Cys Gln Asp Pro Asp Gly Lys Leu Lys Phe
465 470 475 480

Glu Val Ala Asn Asn Val Gln Ser Asp Ile Gly Lys Ala Leu Asp Phe
485 490 495

Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu Ile Leu Ala Gly
500 505 510

Asn Asn Gly Val Val Lys Lys Asn
515 520



<210> 16 
<211> 1506 
<212> DNA 

<213> Arabidopsis sp. 



<400> 16 

atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggac 60
cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120
ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 180
cttgtgtccg taccattcgt ttatcttacg tacttgacca tctccgagac tttagccatc 240
aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300
cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360
aacacgttcg ggaaacggta cataataact gcgagcoctc gaattatggt cgagccattc 420
gtgaaaacat tcctaggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480
ggtcgggcaa ccgggttcac cagaaaacca ggtattctcg tcggtcagta caaacgtgac 540
gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600
agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660
aaatgcgaac cattaccaag aaacaaactc ttaagcccca taatattcca cgagggcaga 720
ttagtccaac gcccaacgcc gttagttgct ctgttaactt tcctctggct tcccgtcggt 780 
ttcgtcctct ctatcatccg cgtctacacg aatattccgt taccggaacg tatcgcccgt 840 
tacaactaca agcttactgg catcaagcta gtcgtcaacg gccaccctcc tccgccgcca 900 
aaacctggcc agccaggcca tcttttggtc tgcaaccacc gcaccgttct cgatcctgtg 960 
gtcacagctg tcgcactcgg ccggaaaatc agctgcgtca cttacagcat cagcaagttc 1020
tctgagctaa tctcaccaat caaagccgtt gcgttgactc gtcaacgtga gaaagacgca 1080
gcgaacatca agcgtctttt ggaggaaggc gatctcgtga tatgtcccga gggaaccacg 1140
tgccgtgagc ctttccttct ccggtttagt gctcttttcg ctgagctcac ggaccggatc 1200
gttcccgtgg cgatcaacac aaagcagagc atgttcaatg gtaccaccac acgtggatac 1260
aagcttcttg atccttactt tgcgttcatg aacccgaggc cgacgtatga gatcacgttc 1320
ctcaaacaga ttccagctga gctgacgtgt aaaggaggca aatctccgat agaggttgcg 1380
aattacatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 1440
aaggataagt acgcaatgct tgctggtact gacggtaggg ttccggtgaa gaaggagaag 1500
acgtga 1506

<210> 17 
<211> 500 
<212> PRT 

<213> Arabidopsis sp.
<400> 17

Met Gly Ala Gln Glu Lys Arg Arg Arg Phe Glu Gln Ile Ser Lys Cys
1 5 10 15

Asp Val Lys Asp Arg Ser Asn His Thr Val Ala Ala Asp Leu Asp Gly
20 25 30

Thr Leu Leu Ile Ser Arg Ser Ala Phe Pro Tyr Tyr Phe Leu Val Ala
35 40 45

Leu Glu Ala Gly Ser Leu Leu Arg Ala Leu Ile Leu Leu Val Ser Val
50 55 60

Pro Phe Val Tyr Leu Thr Tyr Leu Thr Ile Ser Glu Thr Leu Ala Ile
65 70 75 80

Asn Val Phe Val Phe Ile Thr Phe Ala Gly Leu Lys Ile Arg Asp Val
85 90 95

Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val
100 105 110

Arg Pro Asp Thr Trp Arg Ile Phe Asn Thr Phe Gly Lys Arg Tyr Ile
115 120 125

Ile Thr Ala Ser Pro Arg Ile Met Val Glu Pro Phe Val Lys Thr Phe
130 135 140

Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser
145 150 155 160

Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly Ile Leu Val Gly Gln
165 170 175

Tyr Lys Arg Asp Val Val Leu Arg Glu Phe Gly Gly Leu Ala Ser Asp
180 185 190

Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met
195 200 205



WO 00/18889 PCT/US99/22231



Ser Ile Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Pro
210 215 220

Leu Pro Arg Asn Lys Leu Leu Ser Pro Ile Ile Phe His Glu Gly Arg
225 230 235 240

Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp
245 250 255

Leu Pro Val Gly Phe Val Leu Ser Ile Ile Arg Val Tyr Thr Asn Ile
260 265 270

Pro Leu Pro Glu Arg Ile Ala Arg Tyr Asn Tyr Lys Leu Thr Gly Ile
275 280 285

Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln
290 295 300

Pro Gly His Leu Leu Val Cys Asn His Arg Thr Val Leu Asp Pro Val
305 310 315 320

Val Thr Ala Val Ala Leu Gly Arg Lys Ile Ser Cys Val Thr Tyr Ser
325 330 335

Ile Ser Lys Phe Ser Glu Leu Ile Ser Pro Ile Lys Ala Val Ala Leu
340 345 350

Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn Ile Lys Arg Leu Leu Glu
355 360 365

Glu Gly Asp Leu Val Ile Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro
370 375 380

Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr Asp Arg Ile
385 390 395 400

Val Pro Val Ala Ile Asn Thr Lys Gln Ser Met Phe Asn Gly Thr Thr
405 410 415

Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Ala Phe Met Asn Pro
420 425 430

Arg Pro Thr Tyr Glu Ile Thr Phe Leu Lys Gln Ile Pro Ala Glu Leu
435 440 445

Thr Cys Lys Gly Gly Lys Ser Pro Ile Glu Val Ala Asn Tyr Ile Gln
450 455 460

Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg
465 470 475 480

Lys Asp Lys Tyr Ala Met Leu Ala Gly Thr Asp Gly Arg Val Pro Val
485 490 495

Lys Lys Glu Lys
500



<210> 18 
<211> 1620 
<212> DNA 

<213> Arabidopsis sp.
<400> 18

atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240
acacctggag ttagcggatt gtacgaagcg attaagctcg tgatttgtct tccgattgct 300
ctgattagac ttgttctctt tgctgctagc ttagctgttg gttacttggc tacaaaattg 360
gcacttgctg gctggaaaga taaagagaac cctatgcctc tttggagatg cagaatcatg 420 
tggattactc ggatctgtac cagatgtatc ctcttctctt ttggctatca gtggataaga 480 
aggaaaggga aacctgctcg gagagagatt gctccgattg ttgtatcaaa tcatgtttct 540 
tatattgaac caatcttcta cttctatgaa ttatcaccga ccattgttgc atcggagtca 600 
catgattcac ttccatttgt tggaactatt atcagggcaa tgcaggtgat atatgtgaat 660 
agattctcac agacatcaag gaagaatgct gtgcatgaaa taaagagaaa agcttcctgc 720 
gatagatttc ctcgtctgct gttattcccc gaaggaacca cgactaatgg gaaagttctt 780 
atttccttcc aactcggtgc tttcatccct ggttacccta ttcaacctgt agtagtccgg 840 
tatccccatg tacattttga tcaatcctgg ggaaatatct ctttgttgac gctcatgttt 900 
agaatgttca ctcagtttca caatttcatg gaggttgaat atcttcctgt aatctatccc 960 
agtgaaaagc aaaagcagaa tgctgtgcgt ctctcacaga agactagtca tgcaattgca 1020
acatctttga atgtcgtcca aacatcccat tcttttgcgg acttgatgct actcaacaaa 1080
gcaactgagt taaagctgga gaacccctca aattacatgg ttgaaatggc aagagttgag 1140
tcgctattcc atgtaagcag cttagaggca acgcgatttt tggatacatt tgtttccatg 1200
attccggact cgagtggacg tgttaggcta catgactttc ttcggggtct taaactgaaa 1260
ccttgccctc tttctaaaag gatatttgag ttcatcgatg tggagaaggt cggatcaatc 1320
actttcaaac agttcttgtt tgcctcgggc cacgtgttga cacagccgct ttttaagcaa 1380
acatgcgagc tagccttttc ccattgcgat gcagatggag atggctatat tacaattcaa 1440
gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 1500
atgtaccatt tgctagacga cgaccaagat caaagaatca gccaaaatga cttgttgtcc 1560
tgcttaagaa gaaaccctct tctcatagcc atctttgcac ctgacttggc cccaacataa 1620

<210> 19 
<211> 539 
<212> PRT 

<213> Arabidopsis sp.
<400> 19

Met Ala Asp Pro Asp Leu Ser Ser Pro Leu Ile His His Gln Ser Ser
1 5 10 15

Asp Gln Pro Glu Val Val Ile Ser Ile Ala Asp Asp Asp Asp Asp Glu
20 25 30

Ser Gly Leu Asn Leu Leu Pro Ala Val Val Asp Pro Arg Val Ser Arg
35 40 45

Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Leu Ser Glu Ser
50 55 60

Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn
65 70 75 80

Thr Pro Gly Val Ser Gly Leu Tyr Glu Ala Ile Lys Leu Val Ile Cys
85 90 95

Leu Pro Ile Ala Leu Ile Arg Leu Val Leu Phe Ala Ala Ser Leu Ala
100 105 110

Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys
115 120 125

Glu Asn Pro Met Pro Leu Trp Arg Cys Arg Ile Met Trp Ile Thr Arg
130 135 140

Ile Cys Thr Arg Cys Ile Leu Phe Ser Phe Gly Tyr Gln Trp Ile Arg
145 150 155 160

Arg Lys Gly Lys Pro Ala Arg Arg Glu Ile Ala Pro Ile Val Val Ser
165 170 175

Asn His Val Ser Tyr Ile Glu Pro Ile Phe Tyr Phe Tyr Glu Leu Ser
180 185 190

Pro Thr Ile Val Ala Ser Glu Ser His Asp Ser Leu Pro Phe Val Gly
195 200 205

Thr Ile Ile Arg Ala Met Gln Val Ile Tyr Val Asn Arg Phe Ser Gln
210 215 220

Thr Ser Arg Lys Asn Ala Val His Glu Ile Lys Arg Lys Ala Ser Cys
225 230 235 240

Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn
245 250 255

Gly Lys Val Leu Ile Ser Phe Gln Leu Gly Ala Phe Ile Pro Gly Tyr
260 265 270

Pro Ile Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln
275 280 285

Ser Trp Gly Asn Ile Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr
290 295 300

Gln Phe His Asn Phe Met Glu Val Glu Tyr Leu Pro Val Ile Tyr Pro
305 310 315 320

Ser Glu Lys Gln Lys Gln Asn Ala Val Arg Leu Ser Gln Lys Thr Ser
325 330 335

His Ala Ile Ala Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe
340 345 350

Ala Asp Leu Met Leu Leu Asn Lys Ala Thr Glu Leu Lys Leu Glu Asn
355 360 365

Pro Ser Asn Tyr Met Val Glu Met Ala Arg Val Glu Ser Leu Phe His
370 375 380

Val Ser Ser Leu Glu Ala Thr Arg Phe Leu Asp Thr Phe Val Ser Met
385 390 395 400

Ile Pro Asp Ser Ser Gly Arg Val Arg Leu His Asp Phe Leu Arg Gly
405 410 415

Leu Lys Leu Lys Pro Cys Pro Leu Ser Lys Arg Ile Phe Glu Phe Ile
420 425 430

Asp Val Glu Lys Val Gly Ser Ile Thr Phe Lys Gln Phe Leu Phe Ala
435 440 445

Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu
450 455 460

Ala Phe Ser His Cys Asp Ala Asp Gly Asp Gly Tyr Ile Thr Ile Gln
465 470 475 480

Glu Leu Gly Glu Ala Leu Lys Asn Thr Ile Pro Asn Leu Asn Lys Asp
485 490 495

Glu Ile Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg
500 505 510

Ile Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asn Pro Leu Leu
515 520 525

Ile Ala Ile Phe Ala Pro Asp Leu Ala Pro Thr
530 535



<210> 20 
<211> 1128 
<212> DNA 

<213> Arabidopsis sp. 



<400> 20 

atggaaaaaa agagtgtacc aaattctgat aagttgtctc tgattagagt gttaagaggt 60
ataatatgtc tgatggtgtt agtttcaaca gcttttatga tgttgatatt ctgggggttc 120
ttatcagctg tagtgttgag gcttttcagc attcgctata gccgtaaatg tgtttccttc 180
ttctttggct cgtggctcgc cttgtggcct ttcctctttg agaagattaa caaaaccaaa 240
gttatcttct ctggtgataa ggttccttgc gaggatcgag tattgctcat tgcaaaccac 300
cgaacagaag ttgattggat gtacttctgg gatcttgcac tgcgtaaagg ccagattggg 360
aatatcaaat atgtgcttaa gagtagtttg atgaaattac ctctctttgg ttgggcgttt 420
cacctctttg agtttattcc tgttgagagg agatgggaag tcgatgaagc aaacttgaga 480
cagatagttt cgagttttaa ggatccccga gacgctttat ggcttgctct tttccccgag 540
ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600
cttccgatac tgaacaacgt gctgcttccc aggacaaaag gtttcgtctc ctgcttgcaa 660
gaactgagtt gctcacttga cgcagtttat gatgtgacca tcggttataa aacccgctgc 720
ccatctttct tagacaacgt ttatggaatt gagccatcag aagttcacat ccacatccgt 780
cgtatcaacc tgacccaaat cccaaatcaa gaaaaggaca tcaatgcttg g^taatgaac 840
acattccagc tcaaagacca gctgctcaat gacttttact ccaatggtca tttccctaac 900
gaaggaacag agaaagagtt caacacaaag aagtacctca taaactgttt ggcagtgatt 960
gccttcacca ccatctgtac acatctcacc ttcttctcat caatgatttg gttcaggatt 1020
tatgtctctt tggcctgtgt ctacttgacc tctgctacgc atttcaatct tcgttctgtt 1080
ccacttgttg agactgcaaa aaattccctc aaattagtaa acaaataa 1128



<210> 21 
<211> 375 
<212> PRT 

<213> Arabidopsis sp. 
<400> 21 

Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu Ile Arg
1 5 10 15

Val Leu Arg Gly Ile Ile Cys Leu Met Val Leu Val Ser Thr Ala Phe
20 25 30

Met Met Leu Ile Phe Trp Gly Phe Leu Ser Ala Val Val Leu Arg Leu
35 40 45

Phe Ser Ile Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser
50 55 60

Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys Ile Asn Lys Thr Lys
65 70 75 80

Val Ile Phe Ser Gly Asp Lys Val Pro Cys Glu Asp Arg Val Leu Leu
85 90 95

Ile Ala Asn His Arg Thr Glu Val Asp Trp Met Tyr Phe Trp Asp Leu
100 105 110

Ala Leu Arg Lys Gly Gln Ile Gly Asn Ile Lys Tyr Val Leu Lys Ser
115 120 125

Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Ala Phe His Leu Phe Glu
130 135 140

Phe Ile Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Asn Leu Arg
145 150 155 160

Gln Ile Val Ser Ser Phe Lys Asp Pro Arg Asp Ala Leu Trp Leu Ala
165 170 175

Leu Phe Pro Glu Gly Thr Asp Tyr Thr Glu Ala Lys Cys Gln Arg Ser
180 185 190

Lys Lys Phe Ala Ala Glu Asn Gly Leu Pro Ile Leu Asn Asn Val Leu
195 200 205

Leu Pro Arg Thr Lys Gly Phe Val Ser Cys Leu Gln Glu Leu Ser Cys
210 215 220

Ser Leu Asp Ala Val Tyr Asp Val Thr Ile Gly Tyr Lys Thr Arg Cys
225 230 235 240

Pro Ser Phe Leu Asp Asn Val Tyr Gly Ile Glu Pro Ser Glu Val His
245 250 255

Ile His Ile Arg Arg Ile Asn Leu Thr Gln Ile Pro Asn Gln Glu Lys
260 265 270

Asp Ile Asn Ala Trp Leu Met Asn Thr Phe Gln Leu Lys Asp Gln Leu
275 280 285

Leu Asn Asp Phe Tyr Ser Asn Gly His Phe Pro Asn Glu Gly Thr Glu
290 295 300

Lys Glu Phe Asn Thr Lys Lys Tyr Leu Ile Asn Cys Leu Ala Val Ile
305 310 315 320

Ala Phe Thr Thr Ile Cys Thr His Leu Thr Phe Phe Ser Ser Met Ile
325 330 335

Trp Phe Arg Ile Tyr Val Ser Leu Ala Cys Val Tyr Leu Thr Ser Ala
340 345 350

Thr His Phe Asn Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Asn
355 360 365

Ser Leu Lys Leu Val Asn Lys
370 375

<210> 22 
<211> 1170 
<212> DNA 

<213> Arabidopsis sp.
<400> 22 

atggtgattg ctgcagctgt catcgtgcct ttgggccttc tcttcttcat atctggtctc 60 
gctgtcaatc tctttcaggc agtttgctat gtactcattc gaccactgtc taagaacaca 120 
tacagaaaaa ttaaccgggt ggttgcagaa accttgtggt tggagcttgt atggatagtt 180 
gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacctt caatcgaatg 240 
ggcaaagaac atgctcttgt cgtttgtaat caccgaagtg atattgattg gcttgtggga 300 
tggattctgg ctcagcggtc aggttgcctg ggaagcgcat tagctgtaat gaagaagtct 360 
tccaaattcc ttccagtcat aggctggtca atgtggttct cggagtatct ctttctggaa 420 
agaaattggg ccaaggatga aagcactcta aagtcaggtc ttcagcgctt gagcgacttc 480 
cctcgacctt tctggttagc cctttttgtg gagggaactc gctttacaga agccaaactt 540 
aaagccgcac aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600 
cctcgcacca aaggtttcgt gtcagctgtt agtaatatgc gttcatttgt cccagcaatt 660 
tatgatatga cagtgactat tccaaaaacc tctccaccac ccacgatgct aagactattc 720 
aaaggacaac cttcagtggt gcatgttcac atcaagtgtc actcgatgaa agacttacct 780 
gaatcagatg acgcaattgc acagtggtgc agagatcagt ttgtggctaa ggatgctctg 840 
ctagacaaac acatagctgc agacactttc cccggtcaac aagaacagaa cattggccgt 900 
cccataaagt cccttgcggt ggttctatca tgggcatgcg tactaactct tggagcaata 960 
aagttcctac actgggcaca actcttttct tcatggaaag gtatcacgat atcggcgctt 1020
ggtctaggta tcatcactct ctgtatgcag atcctgatac gctcgtctca gtcagagcgt 1080
tcgaccccag ccaaagtcgt cccagccaag ccaaaagaca atcaccaccc agaatcatcc 1140
tcccaaacag aaacggagaa ggagaagtaa 1170

<210> 23 
<211> 389 
<212> PRT 

<213> Arabidopsis sp.
<400> 23

Met Val Ile Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe Phe
1 5 10 15

Ile Ser Gly Leu Ala Val Asn Leu Phe Gln Ala Val Cys Tyr Val Leu
20 25 30

Ile Arg Pro Leu Ser Lys Asn Thr Tyr Arg Lys Ile Asn Arg Val Val
35 40 45

Ala Glu Thr Leu Trp Leu Glu Leu Val Trp Ile Val Asp Trp Trp Ala
50 55 60

Gly Val Lys Ile Gln Val Phe Ala Asp Asn Glu Thr Phe Asn Arg Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Val Cys Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Ile Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110

Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln Arg Leu Ser Asp Phe
145 150 155 160

Pro Arg Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu
180 185 190

Pro Ile Pro Arg Asn Val Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser Asn Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Met Thr
210 215 220

Val Thr Ile Pro Lys Thr Ser Pro Pro Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Val Val His Val His Ile Lys Cys His Ser Met
245 250 255

Lys Asp Leu Pro Glu Ser Asp Asp Ala Ile Ala Gln Trp Cys Arg Asp
260 265 270

Gln Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Ile Ala Ala Asp
275 280 285

Thr Phe Pro Gly Gln Gln Glu Gln Asn Ile Gly Arg Pro Ile Lys Ser
290 295 300

Leu Ala Val Val Leu Ser Trp Ala Cys Val Leu Thr Leu Gly Ala Ile
305 310 315 320

Lys Phe Leu His Trp Ala Gln Leu Phe Ser Ser Trp Lys Gly Ile Thr
325 330 335

Ile Ser Ala Leu Gly Leu Gly Ile Ile Thr Leu Cys Met Gln Ile Leu
340 345 350

Ile Arg Ser Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro
355 360 365

Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu
370 375 380

Thr Glu Lys Glu Lys
385



<210> 24 

<211> 269 

<212> DNA 

<213> Glycine max 

<400> 24 

gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60 
ataagggtct acttcaacct ccctctccca gaacncattg tccgctacac ctacgagatg 120 
ctcggcatca acctcgtcat ccgcggccac cgccctcctc cgcctccccc cggcaccccc 180 
ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240
ctcggccgca aggtctcctg cgtcaccta 269 

<210> 25 

<211> 242 

<212> DNA 

<213> Glycine max 



<400> 25 

tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60
tcacgtggct ccccttcggc ttcatcctct ccatcataag ggtctacttc aaccttcctc 120
tcccagaacg cattgtccgc tacacctacg agatgctcgg catcaacctc gtcatccgcg 180
gccaccgccc tcctccgcct tcccccggca cccccggcaa cctctacgtc tgcaaccacc 240
gc 242

<210> 26

<211> 272

<212> DNA

<213> Glycine max

<400> 26

gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60
catcatactc tccatnctta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120
ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180
caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240
tggttactgc agttgcactt ggaagaaaaa tt 272



<210> 27 

<211> 218 

<212> DNA 

<213> Glycine max 



<400> 27 

atagcacagg agggttacat ggtgcctccg agcaaatcag caaaggcagt cccacaggag 60 
cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120
aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180 
tacttcaacc tccctctccc agaacgcatc gtccgcta 218 

<210> 28 

<211> 270 

<212> DNA 

<213> Glycine max 



<400> 28 

gtgcctgttg ctgtgaactg caagcagaac atgttctttg gaaccaccgt tcgtggcgtc 60 
aagttctggg acccttaact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120 






ttaccttgat acctttgccg aggagatgtc ggttaaggct ggggggaagt cgtccattga 180 
ggtggccaac cacgtggcag aaggtgctgg gggatgtgtt agggtttgag tgcaccgggt 240 
tgactaggaa ggataagtat atgttgttgg 270 

<210> 29 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 29 

catgagggta ggtttgctca aaggccaact cctctagctg ccctcttgac cttcctatgg 60 
ctgccaattg gcatcatact ctccatctta agggtctacc ttaacatccc tttgcctgaa 120 
agaattgttg gtacaactac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180 
caccgccccc aaagaagggt caaagtggtg tctatttgtt tgtaaccacc gcacagtatt 240 
agaccctgtt gt 252

<210> 30 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 30 

ctgggactgc cttaaacgat gcatggatct tatcaagaaa ggagcctctg tttttttctt 60 

tccagaggga acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120 

tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 180 

catgcctgca ggaaaggagg gaatagtgaa cataggttct gtgaaagtgg ttatacataa 240 
acctattgtt ggaaaggatc ctgacatgtt at 272 

<210> 31 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 31 

cgggaatcaa ggtcatcaga cttcaagggt gtttcagctg ttgtcactga cagaattcga 60 
gaagctcatc agaatgagtc tgctccatta atgatgttat ttcaagaagg tacaaccaca 120 
aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaaggc accggtactt 180 
cctgtgatat tacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239 

<210> 32 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 32 

gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 60 
cttccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120
tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180 
cgttggtcac tcttctcccc attcgagtcg ttctcgccgt caccatattg ctcttttatt 240 
ac 242 



<210> 33 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 33 

ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggaaa nctcaaatct 60 
natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120 
gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga 180 
gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240 
ccgcgacg 248 

<210> 34 

<211> 217 

<212> DNA 

<213> Glycine max 



<400> 34 

aaaaccctaa ttctatacat ggaagggaaa tctcaaatct aatgactaat taattaatcc 60 
atcgatcaag catggagtcc gaactcaaag acctcaattc gaagccgccg aacggcaacg 120
gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga gcctccggtc tccgccgaca 180
gcatcgccga tatggagaag aagttcgccg cttacgt 217 

<210> 35 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 35 

atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttggaa aggaaatctc 60 
aaatctaatg actaattaat caatcaatcg tactaataat ccatcgatca agtatggagt 120
ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180 
gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240 
tggagaagaa gttcgcc 257

<210> 36 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 36 

cccgaccaaa acaggttttt gtggccaatc atacttccat gattgatttc attatcttag 60 
aacagatgac tgcatttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120 
agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180 
gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240
ttatatttcc tgaaggaact tgtgtaaata atcactactc gtca 284 

<210> 37 

<211> 246 

<212> DNA 

<213> Glycine max 

<400> 37 

ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60 
cctccctaaa accctaattc tacatttgga aaggaantct caaatctaat gataattaat 120 
caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180 
tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg 240 
aagccg 246

<210> 38 

<211> 278 

<212> DNA 

<213> Glycine max 



<400> 38 

gttttctatt gccacgttgt ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat 60



cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120 

aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacatttct cctagtctat 180 

ctgaggcagc acgtgccatt gtagatgata cattcacaag gtgcttcaag caaatcctcc 240 

agaaccttgg aactggaatg tttatttgtt tcctttgt 278 

<210> 39 

<211> 312 

<212> DNA 

<213> Glycine max 



<400> 39 

ttaactttgg cacattctcc ttttgttcat caatgtgtgt tgtaaattgt ncatttcctt 60



cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120 

aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc caaccatact 180 

tcatgattga tntcattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 

atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacnnn 300 

acttgcgtct tc 312 



<210> 40 

<211> 255 

<212> DNA 

<213> Glycine max 
<400> 40

ggattattgn ngcanatgca gtcatctgtt ctaagataat ganatcnatc atggaagtat 60 

gattggncac anaaacctgt yttttggttg gatactaggt cttggcccat ggtacttgac 120

naccccagtc catgatgcaa canaganact gnacatcatc tccaccaaac ccctctgana 180 

ganacgagaa ttgagcaatt tagagtacct tggtctgatg caagtcagta tattcaagtt 240 

tctattcatc aaagg 255

<210> 41 

<211> 291 

<212> DNA 

<213> Glycine max 

<400> 41 

caacctccca tgcaatcgct caccctctcc gtcacctgaa tctgttttct attccctccg 60 
tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg aattggacct 120 
tcacattgaa gattacctgc cttctggatc cagtgttcaa caagaacggc atggcaagct 180 
ccgcctgtgt gatttgctag acatttctcc tagtctatct gaggcagcac gtgccattgt 240 
agatgataca ttcacaaggt gcttcaagtc aaatcctcca gaaccttgga a 291 

<210> 42 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 42 

ctgcaaccta ccatgcaatt cctcacctga atccgttttc tattgccacg ttgtggaagc 60 

gtaacgaaga tgaatggcat tgggaaactc aaatcgtcga gttctgaatt ggaccttcac 120 

attgaagatt acctaccttc tggatccagt gttcaacaag aacggcatgg caagctccga 180 

ctgtgtgatt tgctagacat ttctcctagt ctatctgagg cagcacgtgc catgtagatg 240 

atacatcaca aggtgctcaa gtcaaatctc cagaaccttg gaat 284 

<210> 43 

<211> 268 

<212> DNA 

<213> Glycine max

<400> 43 

ctgaagtatt ctcgtcctag cccaaagcat agagaaaggh agcaacagaa ctttgctgag 60 

tcagtgctgc ggcgatggga ggaaaagtga tgtgtacctt tatgtggtgt tgttcttaat 120 

tattcttagt aatgccattg cttcgacccc tttttttgct tttgttttgt cattgctaac 180 

tatttatttt taacactttt attaaagata tggcatatat ncacttcagt anacaaagtt 240

gtnccagtaa tttnttttcc aaaaaaaa 268 

<210> 44 

<211> 241 

<212> DNA 

<213> Glycine max 

<400> 44 

gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcnac ctaccatgca 60 

attccctcac ctgaatccgt tttctattgc cacgttgtgg aagcgtaacg aagatgaatg 120 

gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gattacctac 180 

cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240 

a 241 

<210> 45 

<211> 247 

<212> DNA 

<213> Glycine max 



<400> 45 

gtaggatgtc tgagatcctt gccccaatca aaacggtgcg gttaactaga aaccgcgacg 60
aggatgcgaa aatgatgaaa aatttgctgg ggcaagggga cctggtggtt tgtcctgaag 120
ggaccacatg tagagaacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180
atgagattgt ccccgttggc agttgattcc cagttatatg ttccacggaa ccactgctgg 240
tgganta 247



<210> 46 
<211> 271 
<212> DNA 
<213> Glycine max
<400> 46 

tgcagggggg cttgttagag ccatagtttt ggttcttcta tacccttttg tttgtgtcgt 60 
aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120 
gagcctcaga gttggaaggt ccgttttgcc cnaattcttc tnggaggacg ttngtgcaga 180 
aatgtttgag gcactcaaaa aaggagggaa gacagtggga gttaccaatt taccccacgt 240
gatggtggaa agcttcttga gagagtattt g 271

<210> 47 

<211> 242

<212> DNA 

<213> Glycine max 

<400> 47 

ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tcccgcctat 60 
caccgaatgc aacggaacga cnccgtgcga ntctgtngnc gccgacctcg agggtacgct 120 
cctcatctcc cgtngctcgt tcccgtactt catgctcgtc gccgtcgaag ccggcagcnt 180 
cctccgcggc ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctcttcat 240 
ct 242 

<210> 48 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 48 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120
tagacccaag accgctccaa ccagaccatc gcctcggacc tcgatggcac cctccttgtc 180
tcccggagtg ccttccccta ctacttcctc gtcgccctcg aagccggcag cgtcttccga 240
gcct 244 

<210> 49 
<211> 230 
<212> DNA
<213> Glycine max 

<400> 49 

caacattcca cctagctccc caatcacatc ttcaccacac cataaacctt cttaatttct 60 
ctcttcattt tctcctctat tgtcataatc atggggacct tccctcgctt cgacccaatc 120 
accacccaag accggtccaa ccagaccgtg gcctccgacc ttgacggcac cctcctcgtc 180 
tcccggagcg ccttccccta ctacctcctc gttgccctcg aagccggcag 230 

<210> 50 

<211> 265 

<212> DNA 

<213> Glycine max 



<400> 50 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta ntnatcttcg tggctgtggc tggtgttcca aagtccgaga 240 



ttgaatcagt ggctagggca gtttt 265



<210> 51 

<211> 252 

<212> DNA 

<213> Glycine max 



<400> 51 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta atgatcttcg tggctgtggc tgggttccaa agtccgagat 240 
tgaatcagtg gc 252 



<210> 52 
<211> 218 
<212> DNA

<213> Glycine max 

<400> 52 

aactgcaact acaacaacat tcattcattc acagctgtca cgccgtgaac ggaaaatggc 60 

aacggcgaga cgcagtttac ccgcctatac accgaatgca acggaacgac accgtgcgag 120 

tctgtggccg ccgacctcga cggtacgctc ctcatntccc gtagctcgtt cccgtacttc 180

atgctcgtcg ccgtcgaagc cggcagcctc ctccgcgg 218

<210> 53 

<211> 262 

<212> DNA 

<213> Glycine max 

<400> 53 

ggttaaggac attgagatgg tcgnntcctc ggtgctgccc aagttctaca ccgaggacgt 60 
gcnccccgag agctggagag tcttcaatcc ttcgggaagc gttacattgt cactgctagt 120 
ctagggtgat ggtggagcan tttgttaaga cgtttcttgg ggctgataag gtgcttggga 180 
ctgagcttga ggccacgaaa tcggggaggt tcatgggttt gttaaggagc ctggtgtgct 240 
tgttggggag cacaagaaag tg 262 

<210> 54 

<211> 212 

<212> DNA 

<213> Glycine max 

<400> 54 

gcaactacaa caacattcat tcattcacag ctgtcacgcc gtgaacggaa aatggcaacg 60 

gcgagacgca gtttcccgcc tatcaccgaa tgcaacggaa cgacgccgtg cgagtctgtg 120 

gccgccgacc tcgacggtac gctcctcatc tcccgtagnc cgttcccgta cttcatgctc 180 

gtngccgtcg aagccggcag cctcctccgc gg 212 

<210> 55 

<211> 273 

<212> DNA 

<213> Glycine max 

<400> 55

catggttttc ttgagcttct ttggcctcag aaaggacaca ttcagaacag gatcagctgt 60 

tctggcaaag ttcttcttag aagatgttgg attggaaggc tttgaggccg taatatgttg 120 

tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 180 

ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240 

gggagttttt gagagtaaga agccaattaa aat 273

<210> 56 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 56 

ctctcaaaaa aggagggaag acagtgggag tcaccaatct accccatgtg atggtggaaa 60 

gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagttttct 120 

gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180 

aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtttcgcaac atacgcgacc 240 

atgatgattt tttctcc 257 

<210> 57 

<211> 240 

<212> DNA 

<213> Glycine max 

<400> 57 

gaactaagtg tgaaccacta ccaagaaaca agcttttaag tccaattatt tttcatgagg 60 

gtaggtttgc tcaaaggcca actcctctag ctgnnctctt gaccttccta tggctgccaa 120 

ttggcatcat actctccatc ttaagggtct accttaacat ccctttgcct gaaagaattg 180 

cttggtacaa ctacaagctc ttaggaatca gagttattgt gaagggtacc cctccaccgc 240

<210> 58 
<211> 254 
<212> DNA 
<213> Glycine max
<400> 58 

cttggaataa gggtcattag gaagggtatc cctccacccc cagcnaagaa gggccaaagt 60 
ggagtcctat ttgtatgcaa ccacaggaca gttttagacc ctgtggttac agctgttgca 120 
ttaggaagga aaattagctg tgtcacatat agcataagca aattcactga aataatttca 180 
ccaatcaaag ctgtggcact ctctagggag agggacaaag atgctgccaa catcaagang 240
ttgcttgagg aagg 254 

<210> 59 

<211> 267

<212> DNA 

<213> Glycine max 

<400> 59 

gccaganaga cttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60 
tatccctcca cccccagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120 
gacagtttta gaccctgtgg ttacagctgt tgcattagga aggaaaatta gctgtgtcac 180 
atatagcata agcaaattca ctgaaataat tcaccaatca aagctgtggc actctctagg 240 
gagagggacc nagatgctgc cnacatc 267 

<210> 60 

<211> 261 

<212> DNA 

<213> Glycine max 

<400> 60 

gtaaccacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60 
tgcttatgct atatgtgaca cagctaattc accgnaataa tttcaccaat taaagctgtg 120 
gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180 
gacttggtga tttgccctga aggcacaact tgtagagagc cttcctcttg aggttcagtg 240 
cactatttgc tgaactcact g 261 

<210> 61 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 61 

caaggagctc acatgcagtg gagggaaatc agctattgaa gttgcaaact acattcaaag 60 

ggttcttgca gggactttgg gatttgagtg cacaaatttg actaggaaga gcaaatatgc 120 

catgcttgca ggcacagatg ggacagttcc atctaaggag aaggcttgan aagggagaga 180

aattaagttc tcccttttga ttattctgta ttggtgccca atgtgtttcc aaaacactta 240 

gaattatgat agaaataa 258 



<210> 62 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 62 

attggcataa tcctctccat cctaagggtc tatctcaaca tccctctgcc agaaagactt 60 
gcttgntaca actacaagct tcttggaata agggtcatta ggaagggtat ccctccaccc 120
ccagcaaaga agggccaaag tggagcctat ttgtatgcaa ccacaggaca gttttagacc 180 
ctgtggttac agctgttgca ttaggaagga aaattagctg tgtcacatat agcataagca 240 
aattcactga aataattt 258 

<210> 63 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 63 

cacttcacca ccacaccaca accctaccct ctctctctgt catggtcatt ggaggagcct 60 
tccctcgttt cgacccaatc accaaatgta gcacccaaga ccgctccaac cagaccatcg 120 
cctcggacct cgatggcacc ctccttgtct cccggagtgc cttcccctac tacttcctcg 180 
tcgccctcga agccggcagc gtcttccgag ccctccttct cttaaccttc gtccccttc 239



<210> 64 
<211> 531 
<212> DNA

<213> Glycine max 



<400> 64 

ccgagaaccg gtctaaccaa accgtggcct cggacttgga cggcaccctc ctggtgtccc 60
ccagcgcatt tccttactac atgctggtcg ccatcgaagc cggcagcttc ctccgtggcc 120
ttgtcctcct tgcctccgtc cctttcgtgt attcacgtac atattcctct ccgagaccgc 180
ggccatcaag tccctgatct tcatcgcctt cgcgggcctg aaggtcaggg acgttgagat 240
ggtcgcgtgc tcggtgctgc ccaagttcta cgccgacata ttcttcagtt agctccccca 300
acctatacac ttcaccacca caccacaacc ctaccctctc tctctgtcat ggtcattgga 360
ggagccttcc ctcgtttcga cccaatcacc aaatgtagca cccaagaccg ctccaaccag 420
accatcgcct cggacctcga tggcaccctc cttgtctccc ggagtgcctt cccctactac 480
ttcctcgtcg ccctcgaagc cggcagcgtc ttccgagccc tccttctctt 531



<210> 65 

<211> 256 

<212> DNA 

<213> Glycine max 



<400> 65 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 
tagcacccaa gaccgctcca accagaccat cgcctcggac ctcgatggca ccctccttgt 180 
ctcccggagt gccttcccct actacttcct cgtcgccctc gaagccggca gcgtcttccg 240 
agccctcctt ctctta 256 



<210> 66 

<211> 260 

<212> DNA 

<213> Glycine max 



<400> 66 

ccatccaaca tattcttcag ttagctcccc caacctatac acttcaccac cacaccacaa 60 
ccctaccctc tctctctgtc atggtcattg gaggagcctt ccctcgtttc gacccaatca 120 
ccaaatgtag cacccaagac cgctccaacc agactatcgc ctcggacctc gatggcaccc 180 
tccttgtctc ccggagtgcc ttcccctact acttcctcgt cgccctcgaa gccggcagcg 240 
tcttccgagc cctccttctc 260 



<210> 67 

<211> 248 

<212> DNA 

<213> Glycine max 



<400> 67 

caccaaccaa acctcactct ccctttctcc cctgaccctc tccctgccat ggtcatggga 60 
gcctttggcc acttcgaacc ggtctccaaa tgcagcaccg agaaccggtc taaccaaacc 120 
gtggcctcgg acttggacgg caccctcctg gtgtccccca gcgcatttcc ttactacatg 180 
ctgggcgcca tcgaagccgg cagcttcctc cgtggccttg tcctccttgc ctccgtccct 240 
ttcgtgta 248 



<210> 68 

<211> 283 

<212> DNA 

<213> Glycine max 



<400> 68 

ttcttcccca ccatcacacc aancaaacct cactctncct ggccatggtc atgnnngcct 60 



ttccgccact tcgaaccggt ttccaaatgc agcaccgaaa accggtttaa ccaaaccgtg 120 
gcctcggact tggacggcac cctcctggtg tcccctagcg cctttcctta ctacatgctc 180 
gtcgccatcg aagccggcag cttcctccgt ggccttgtcc tccttggatc cgtccctttc 240 
gtgtacttca cgtacatatt cttctccgag accgcggcca tca 283 



<210> 69 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 69 

ctcttcttcc ccaccatcnn accaaccaaa cctcactctc cctgaccatg gtcatgggag 60 
cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaccg 120 
tggcctcgga cttggacggc accctcctgg tgtcccctag cgcctttcct tactacatgc 180 
tcgtcgccat cgaagccggc agcttcctcc gtggccttgt cctccttgga tccgtccctt 240 
tcgtgtactt cacgtaca 258 

<210> 70 

<211> 256 

<212> DNA 

<213> Glycine max 

<400> 70 

tgcaactaca acaacattca ttcattcaca gctgtcacgc cgtgaacgga aaatggcaac 60 
ggcgagacgc agtttcccgc ctatcaccga atgcaacgga acgacaccgt gcgagtctgt 120 
ggccgccgac ctcgacggta cgctcctcat ctcccgtagc tcgttcccgt acttcatgct 180 
cgtcgccgtc gaagccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240 
cgtcatcanc gcctac 256 

<210> 71 

<211> 259 

<212> DNA 

<213> Glycine max 

<400> 71 

cttccccacc atcacaccan ggcnaacctc antctccctt tctccacnga ccctctccct 60 

gccatngtca tgggancctt tggccacttc gaaccggtct ccaaatgcag caccgagaac 120 

cggnctaacc aaaccgtggc ctcggacttg gacggcaccc tcctggtgtc ccncagcgca 180 

tttccttact acatgctggc ngccatcgaa gccggcagct tcctccgtgg ccttgtcctc 240 
cttgcctccg tccctttcg 259 

<210> 72 

<211> 249 

<212> DNA 

<213> Glycine max 



<400> 72 

ccaacatatt cttcagttag ctcccccaac ctatacactt caccaccaca ccacaaccct 60 
accctctctc tctgtcatgg tcattggagg agccttccct cgtttcgacc caatcaccaa 120 
atgtagcacc caagaccgct ccaaccagac catcgcctcg gacctcgatg gcaccctnct 180 
tgtctcccgg agtgccttcc cctactactt cctcgtcgcc ctcgaagccg gcagcgtctt 240 
ncgagccct 249 



<210> 73 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 73 

caaccctctt cttccccacc atcacaccaa ncaaacctca ctctcccttt ctcccctgac 60 
cctctccctg ccatggtcat gggagccttt ggccacttcg aaccggtctc caaatgcagc 120 
accgagaacc ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtcc 180 
cccagcgcat ntccttacta catgctggtc gccatcgaag ccggcagctt cctccgtggc 240 
cttgtcctcc ttgcctg 257 

<210> 74 

<211> 255 

<212> DNA 

<213> Glycine max 

<400> 74 

gccgaagacg tgcacccgga gagttggaga gtgttcaact ctttcgggaa gcgttacatt 60 

gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct cggggctgac 120 

aaggtgcttg ggactgaact tgaggccacc aaatcgggga cgttcactgg gtttgttaag 180 

aagcctggtg tgcttgttgg ggagcataag aaagtggctc tggtgaagga gtttcagggt 240 

aattacctga cttgg 255 

<210> 75 

<211> 244 

<212> DNA 

<213> Glycine max 



<400> 75 



caacaacatt cattcattca cagctgtcac gccgtgaacg gaaaatggca acggcgagac 60 
gcagtttccc gcctatcacc gaatgcaacg gaacgacacc gtgcgagtct gtggccgccg 120 
acctcgacgg tacgctcctc atcncccgta gctcgttccc gtacttcatg ctcgtcgccg 180 
tcgaagccgg cagcctcctc cgcggcctca tgcnttcctg ggtttanttt gagnacccct 240 
gagg 244 



<210> 76 

<211> 240 

<212> DNA 

<213> Glycine max 

<400> 76 

gctggctacc ctcttcttcc ccaccatcac accaatcaaa cctcactcta ccctggccat 60 
ggtcatggga gcctttncgc cacttcgaac cggtttccaa atgcagcacc gaanaccggt 120 
ttnaccanac cgtggcctcg gncttggacg gcaccctcct ggtgtcccct agcgcctttc 180 
cttactacat gctcgtcgcc atcgaagccg gcagcttcct ccgtggcttg tcctccttgg 240 



<210> 77 

<211> 263 

<212> DNA 

<213> Glycine max 



<400> 77 

gtttctcggg gctgacaagg tgcttgggac tgaacttgag gccaccaaat cggggacgtt 60 
cactgggttt gttaagaagc ctggtgtgct tgttggggag cataagaaag tggctctggt 120 
gaaggagttt cagggtaatt tacctgactt gggtctaggt gatagtaaaa gtgattatga 180 
cttcatgtca atttgcaagg aagggtacat ggtgccaaga actaagtgtg aaccactacc 240 
aagaaacaag cttttaagtc caa 263 

<210> 78 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 78 

ggccacgaaa tcggggaggt tcactgggtt tgttaaggag cctggtgtgc ttgttgggga 60 
gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgact tgggactagg 120 
agatagtaaa agtgattatg acttcatgtc aatttgcaag gaagggtaca tggtgccaag 180 
gactaagtgt gaaccactac caagaaacaa acttttaagt ccaattattt ntcatgaggg 240 
taggtttgtt caaaggcc 258 



<210> 79 

<211> 260 

<212> DNA 

<213> Glycine max 



<400> 79 

ctcttcttcc ccaccatcac accaancaaa cctcactctc cctttctccc ctgaccctct 60 
ccctgccatg gtcatgggag cctttggcca cttcgaaccg gtctccaaat gcagcaccga 120 
gaaccggtct aaccaaaccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180 
cgcatttcct tactacatgc tggtcgccat cgaagccggc agcttcctcc gtgggccttg 240 
tcctccttgc ctccgtccct 260 



<210> 80 

<211> 257 

<212> DNA 

<213> Glycine max 



<400> 80 

gggaacaaca acaaatggca ngaaccttat ctccttccaa cttggtgcat ttatccctgg 60 
atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120 
tcatgtntct ttgggaaagc ttatgttcag aatgttcact caatttcaca acttttttga 180 
ggtagaatat cttcctgtca tttatcccct ggatgataag gaaactgctg tancttntcg 240 
ggagaggact agccggg 257 



<210> 81 

<211> 272 

<212> DNA 

<213> Glycine max 
<400> 81 

catacctttt gttggcacca ttattagagc aatgcaggtc atatatgtta acagattctt 60 

accatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc 120 

tcttgtgata aatttcctcg agtactatta tttcccgagg gaacaacaac taatggcagg 180 

aaccttatct ccttccaact tggtgcattt atccctggat acccaatcca gcctgtaatt 240 

atacgctatc ctcatgtaca ctttgaccaa tc 272 

<210> 82 

<211> 245 

<212> DNA 

<213> Glycine max 

<400> 82 

gggcatttca catactagag ttcatcccag tgaaaagaaa gtgggaggct gatgaatcaa 60 
tcatgcgcca tatgctttct acattcaagg atccacaaga tcctctctgg cttgcgcttt 120 
tcccagaagg cactgatttc actgagcaaa agtgccttcg gagtcaaaaa tatgctgctg 180 
aacataagtt accggttctg aaaaatgttt tacttccaag gacaaagggg cttctgtgcc 240 
gcttg 245 

<210> 83 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 83 

cagtgtcctt cctttctgga caatgttttt ggtgttgacc cttcagaagt gcacctgcat 60 

gtgcggcgta ttccggtgga ggagattcca gcttctgaaa ccaaagctgc ttcttggtta 120 

atcgacacat tccagatcaa ggaccaattg ctttcggatt tcaagattca aggccatttc 180 

cctaaccaac taaatgaaaa tgaaatttct agatttaaga gcctactctc ttttatggtg 240 

atagtttctt ttactgccat gtttattt 268 

<210> 84 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 84 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180 

attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240 

catttatgat tgcacatatg cagtt 265 

<210> 85 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 85 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180 

attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240 
catttatgat tgcacatatg cagtt 265 

<210> 86 

<211> 301 

<212> DNA 

<213> Zea mays 



<400> 86 

ctcgtcgtca agggcacccc gccgccgccg cccaagaagg gccacccggg cgtcctcttc 60 
gtctgcaacc accgcaccgt gctcgacccc gtcgaggtgg ccgtggcgct gcgccgcaag 120 
gtcagctgcg tcacctacag catctccaag ttctccgagc tcatctcgcc catcaaggcc 180 
gtcgcgctgt cgcgggaggc gacaaggacg ccgagaacat ccgccgcctg ctggaggagg 240 
gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300 
g 301 



<210> 87 
<211> 309 
<212> DNA 
<213> Zea mays 

<400> 87 

cgctcatgcg gtgtacatca acctgccgct gcccgagcgc atcgtctact acacctacaa 60 
gctcatgggc atcaggctcg tcgtcaaggg caccccgccg ccgccgccca agaagggcca 120 
cccgggcgtc ctcttcgtct gcaaccaccg caccgtgctc gaccccgtcg aggtggccgt 180 
ggcgctgcgc cgcaaggtca gctgcgtcac ctacagcatc tccaagttct ccgagctcat 240 
ctcgcccatc aaggccgtcg cgctgtcggg gaggcgacaa ggacgccgag aacatccgcc 300 
gcctgctgg 309 



<210> 88 
<211> 304 
<212> DNA 
<213> Zea mays 



<400> 88 

tggctgtgca ggaggcctac ctggtgacgt caaggaagta cagcccggtg cccaggaacc 60 
agctgctgag cccgctgatt cgtgcacgac ggccgcctcg tgcagcgccc gacgccgctc 120 
gtcgcgctcg tcaccttcct ctggatgccg ttcggcttcg cgctggcgct catgcgcgtg 180 
tacatcaacc tgccgctgcc cgagcgcatc gtctactaca cctacaagct catgggcatc 240 
aggctcgtcg tcaagggcac cccgccgccg ccgcccaaga agggccaccc gggcgtcctc 300 
ttcg 304 



<210> 89 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 89 

ggttcatcca cttgtgttgc tattngaccg gtaccgtagg agagcacagc actancatcg 60 
caaagatttn gggctacggt gacaatctcc atgttctaca atcttnaggt cgaaggaatg 120 
gagaatctgc ctccaaatag ctgtcctggt gtctatgttg ctaaccatca gagcttcttg 180 
gatatttata cccttctaac tctagggagg tgcttcaaat ttataagcaa gaccagcatc 240 
tttatgttcc ctattatagg gtgggcaatg tatctcttgg gtgtgattcc tctgcggcgt 300 
atggacagca gg 312 

<210> 90 
<211> 264 
<212> DNA 
<213> Zea mays 

<400> 90 

ggtgctgtat ctgaaagaat ccatcgtgct catcaacaga aaaatgcacc aatgatgcta 60 
ctcttcccct gagggcacaa ctacaaatgg ggattatctc cttccattca aaacaggtgc 120 
ttttcttgca aaggcaccag ttcaaccagt cattttgaga tatccttaca aaagatttaa 180 
tgcagcatgg gattccatgt caggggcacg tcatgtattt ctgctgctct gtcaatttgt 240 
aaattaccta gaggtggtcc gctt 264 



<210> 91 
<211> 212 
<212> DNA 
<213> Zea mays 



<400> 91 

aaatgtcttg gatgcatttt tgttcagcgg gagtcgaaaa caccagattt caaaggtgtt 60 
tcaggtgctg tatttgaaag aatccatcgt gctcatcaac agaaaaatgc accaatgatg 120 
ctactcttcc ctgagggcac aactacaaat ggggattatc tccttccatt caaaacaggt 180 
gcttttcttg caaaggcacc agttcaacca gt 212 



<210> 92 
<211> 267 
<212> DNA 
<213> Zea mays 



<400> 92 

gtctaaagaa atngaaaggc gtggggnaat tgtgtctaat catgtntctt atgtggatat 60 
tctttatcan atgtcagcct cttttcctag ttttgttgct aagagatcag tggntagatt 120 
gcctctagtt ggtctcataa gcaaatgtct tggatgcatt tttgttcagc gggagtnnaa 180 
aatncanatt tcaaaggtgt ttaaggtgtg gnatctgaaa gaatccatcg tgctcatcaa 240 
cagaaaaatg caccaacgat gctactc 267 



<210> 93 
<211> 152 
<212> DNA 
<213> Zea mays 

<400> 93 

ctacaaatgg ggattacctt cttccattta agactggagc ctttnttgca ggcgcaccag 60 
tgcagccagt cattttgaaa tacccttaca ggagatttag tccagcatgg gattcaatgg 120 
atggagcacg tcatgtgtta ttgctgctct gt 152 

<210> 94 
<211> 274 
<212> DNA 
<213> Zea mays 

<400> 94 

aaaatataaa ttaatatggt cttaatccca ccatataaat aacgttctct ttctgcaggg 60 

caatttagtt ctttctaata ttgggctggc agagaagcgc gtgtaccatg cagcactgac 120 

tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180 

tttctgtaac agacagccga ggaacactta aaaatgtaac tgtgtgcgtg tttttatacc 240 
tgtaatgtgg cagtttattt gtttgaggag gctg 274 

<210> 95 
<211> 295 
<212> DNA 
<213> Zea mays 



<400> 95 

aatagctatc aagtacaata aaatatttgt tgatgccttt tggaacagta agaagcaatc 60 
ttttacaatg cacttggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120 
cttacctcct caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180 
ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240 
caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295 



<210> 96 
<211> 273 
<212> DNA 
<213> Zea mays 

<400> 96 

gngccatctc accggcggcn ggcctgcggc cggcaaccgg aggcgatggc gagctngtct 60 
gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctccc gcccgactcg 120 
ctcccgcagg aggcgcccag gaatctccat ctgcgcgatc tgcttgacat ctcgccggtg 180 
ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 
tctccagaac catggaatgg aacatatatt tgt 273 

<210> 97 
<211> 127 
<212> DNA 
<213> Zea mays 



<400> 97 

ctcaatatct ganggaggga gagactgcaa ttgcgtttgc tgagagagta agggacatga 60 
tagcagctag agctggtctt aagaaggtcc cgtgggatgg ctatctgaag cacaaccgcc 120 
ctagtcc 127 

<210> 98 
<211> 286 
<212> DNA 
<213> Zea mays 

<400> 98 

gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc gcataatttt 60 
nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120 
gcgagctcgt ctgtggcggc ggacatggag ctggaccgcc ccaacctgga ggactacntc 180 
ccgcccgant cgctcccgca ggaggcgacc aggaatctcc atctgngcga tctgcttgan 240 
atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286 

<210> 99 
<211> 308 
<212> DNA 
<213> Zea mays 



<400> 99 

cgccatctca tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg attggcgagc 60 
tcgtctgtgg cgccggacat ggagctggac cgcccanacc tggaggacta nctcccgccc 120 
gactcgrmcc cgcagaggcg ccccggaatc tccanctgcg cgatctgctg gacatcncgc 180 
cggtgctcac cgaggcagcg ggtgccattg tcgatgactc cttcacacgg ngctttaagt 240 
caaattctcc agagccatgg aattggaaca tatatctgtt ccccttatgt gctttggtgt 300 
ataataag 308 



<210> 100 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 100 

cagaaactag angttagtca cagcatggca ttaaattgtc atagtaaaca acancncact 60 
gagcaactat gcaatttaat gccatgctgt gactaacttc tagtttctgg cattaaatta 120 
ctgtttggct actaggaaga ccgaggtaga gaagcaaata taagaatacc ctccaacgca 180 
canccaaatg acagagtaaa tgaaggtagg gttcaccttc ttgaacatga ccgtatactg 240 
gttgttaaca caagttcctc tgggaaaatc agagagggtt tt 282 



<210> 101 
<211> 282 
<212> DNA 
<213> Zea mays 

<400> 101 

ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 60 

acnggcatgt cgtggcggct caaagggtng cgcccngngc ttgcnnngcc gtgctccggc 120 

gggcgctgnc agctgttcgt gtgcaacnac cggacgctga tcgacccngt gtacgtgtcc 180 

gtagcgtgga ccgggaaatg cgcgncgtgt nctacagnct gangcggntn tcggagctca 240 

tctcccccat ngncggaang tgcacctgan accgggaacg gg 282 

<210> 102 
<211> 290 
<212> DNA 
<213> Zea mays 



<400> 102 

ggacgcggca ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60 
accacgtgcc gggagccctt cctgctccgc ttctccaagc tcttcgcgga gctcagcgac 120 
aggatcgtgc ccgtggcgat gaactaccgc gtggggctct tccacccgac gacggcgcgc 180 
gggtggaaag ccatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 240 
cgttcctgaa ccantccccg caaagcgacg tgcgcggcgg ggaagagccc 290 



<210> 103 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 103 

acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60 
ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgctcggg ttcgagtgca 120 
ccaccctcac aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 180 
ccaagccggc ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240 
tctgctccac taacaattac accttgccca gatctggac 279 



<210> 104 
<211> 315 
<212> DNA 
<213> Zea mays 



<400> 104 

gcccgagcgc atcgtctact acacctacaa gctcatgggc atcaggctcg tcgtcaaggg 60 
caccccgccg ccgccgccca agaagggcca cccgggcgtc ctcttcgtct gcaaccaccg 120 
caccgtgctc gaccccgtcg aggtggccgt ggcgctgcgc cgcaangtca gctgcgtcac 180 
tacagcatct ccaagttctc cgagctcatc tcgcccatca aggccgtagc agnaaagcag 240 
gtcgcaaatg gagcagnagc gagtcgatgg aagngaattg gcgactggtc atccgcncga 300 
aggnacaccg cggag 315 

<210> 105 
<211> 314 
<212> DNA 
<213> Zea mays 



<400> 105 

cgagacaccg agcacgtact accagcaaga tggtggcgtc tcccagattc aagcccatcg 60 
aggagtgctg ctcggagggg cggtcggagc agacggtggc cgccgacctg gacggcacgc 120 
tgctcatctc caggagcgcg ttcccctact acctcctcgt ggctctcgag gccggcagcg 180 
tcctccgcgc cgcgctgctg ctcctgtccg tgccgttcgt ctacgtcacc tacgccttct 240 
tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggcgc 300 
gcanatcgag atgg 314 



<210> 106 
<211> 291 
<212> DNA 
<213> Zea mays 



<400> 106 

ctctgggtct ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60 
gattcaagcc catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120 
acctggacgg cacgctgctc atntccagga gcgcgttccc ctactacctc ctcgtggctc 180 
tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240 
tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac a 291 



<210> 107 
<211> 300 
<212> DNA 
<213> Zea mays 



<400> 107 

gcacgcagca gtacgacgtc tctcctctgg gtctggggcc gagacaccga gcacgtacta 60 
ccagcaagat ggtggcgtct cccagattca agcccatcga ggagtgctgc tcggaggggc 120 
ggtcggagca gacggtggcc gccgacctgg acggcacgct gctcatctcc aggagcgcgt 180 
tcccctacta cctcctcgtg gctctcgagg ccggcagcgt cctccgcgcc gcgctgctgc 240 
tcctgtccgt gccgttcgtc tacgtcacct acgccttctt ctccgagtcg ctggccatca 300 



<210> 108 
<211> 284 
<212> DNA 
<213> Zea mays 

<400> 108 

gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccca gattcangcc 60 
antcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120 
cacgctgctc atctccagga gcgcgttccc ctacnacctc ctcgtggctc tcgaggccgg 180 
cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240 
ttcttctccg agtcgctggc catcaanacg ctggtgtaca tctc 284 

<210> 109 
<211> 280 
<212> DNA 
<213> Zea mays 



<400> 109 

ctcctctggg tctggggccg agacaccgag cacgtactac cagcaagatg gtggcgtctc 60 
ccagattcaa gcccatcgag gagtgctgct cggaggggcg gtcggagcag acggtggccg 120 
ccgacctgga cggcacgctg ctcatctcca ggagcgcgtt ccnctactac ctcctcgtgg 180 
ctctcgaggc cggcagcgtc ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtct 240 



acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280 

<210> 110 
<211> 287 
<212> DNA 
<213> Zea mays 

<400> 110 

cgtctctcct ctgggtctgg ggccgagaca ccgagcacgt actaccagca agatggtggc 60 

gtctcccaga ttcaagccca tcgaggagtg ctgctcggag gggcggtcgg agcagacggt 120 
ggccgccgac ctggacggca gctgctcatc tccaggagcg cgttccccta ctacctcctc 180 
gtggctctcg aggccggcag cgtcctccgc gccgcgctgc tgctcctgtc cgtgccgttc 240 
gtctacgtca ctacggcttc ttctccgagt cgctggccat cagcacg 287 



<210> 111 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 111 

cgcacagtta cgacgtctct cctctgggtc tggggccgag acaccgagca cgtactacca 60 
gcaagatggt ggcgtctccc agattcaagc ccatcgagga gtgctgctcg gaggggcggt 120 
cggagcagac ggtggccgcc gacctggacg gcacgctgct catctccagg agcgcgttcc 180 
cctactactc ctcgtgctct cgaggccggc aggtcctccg cgccgcgctg tgctcctgtc 240 
gtgcgttcgt ctagtcacta cgcttttctc gancgtggca ataana 286 



<210> 112 
<211> 323 
<212> DNA 
<213> Zea mays 



<400> 112 

gttattccct gaaggtacca caacaaatgg gagattcctg atttcgttcc aacatggtgc 60 
attcatacct ggctaccctg ttcaacctgt tgttgtccgt tatccacatg tgcactttga 120 
tcaatcatgg gggnatatat cgttattaaa gctcatgttt aagatgttca cccaatttca 180 
taatttcatg gaggtagagt accttcctgt tgtctaccct cctgagatca agcaagagaa 240 
tgcccttcat tttgcggagg ataccagcta tgctatggca cgtgccctca atgtcttgcc 300 
aacttcctat tcatatggtg att 323 



<210> 113 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 113 

cgataaggcc cttttcgaag agcttctacc gtcggatcaa cagattcttg gccgagctgc 60 
tgtggcttca gcttgtctgg gtggtggact ggtgggcagg tgttaaggta caactgcatg 120 
cagatgagga aacttacaga tcaatgggta aagagcatgc actcatcata tcaaatcatc 180 
ggagtgatat tgattggctc attggatgga tattggccca gcgttcaggg tgccttggaa 240 
gtacacttgc tgtcatgaag aagtcatcca agttccttcc agttattggc tggtcaatgt 300 
ggtttgcaga gt 312 



<210> 114 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 114 

agtggggtct ccaaaggttg aaagacttcc ctagaccatt ttggctagct ctttttgttg 60 
agggtactcg ctttactcca gcaaagcttc tcgcagctca ggagtatgcg gcttcccagg 120 
gcttaccagc tcctagaaat gtacttattc cacgtaccaa gggatttgta tctgccgtaa 180 
gtattatgcg agattttgtt ccagccattt acgatacaac tgtaatagtt cctaaagatt 240 
cccctcaacc aacaatgctg cggattttga aagggcaat 279 



<210> 115 
<211> 304 
<212> DNA 
<213> Zea mays 



<400> 115 

cgtcaacgcc atccaggccg tcctatttgt gacgataagg cccttttcga agagcttcta 60 
ccgtcggatc aacagattct tggccgagct gctgtggctt cagcttgtct gggtggtgga 120 
ctggtgggca ggtgttaagg tacaactgca tgcagatgag gaaacttaca gatcaatggg 180 
taaagagcat gcactcatca tatcaaatca tcggagtgat attgattggc tcatggatgg 240 
atattggccc agcgttcagg gtgccttgga agtacattgc tgtcatgaag aagtcatcca 300 
agtt 304 

<210> 116 
<211> 259 
<212> DNA 
<213> Zea mays 



<400> 116 

cttcctcctg tccggcctca tcgtcaacgc catccaggcc gtcctatttg tgacgataag 60 
gcccntttcg aagagcttct aacgtcggat caacagattc ntggccgagc tgctgtggct 120 
tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180 
ggaaacttac agatcnatgg gtanagagca tgcactcatc atatcaaatc atcggagtga 240 
tattgattgg cncattgga 259 

<210> 117 
<211> 235 
<212> DNA 
<213> Zea mays 

<400> 117 

attccacgta ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60 
atttatgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120 
ttgaaagggc aatcatcagt gatacatgtc cgcatgaaac gtcatgcaat gagtgagatg 180 
ccaaaatcag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtggc aaagg 235 



<210> 118 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 118 

tgagatgcca aaatcagatg atgacgtttc aaaatggtgt aaagacattt ttgtgacaaa 60 
ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacctat 120 
cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg 180 
tgccatcgag ttcttcaagt ggacgcagct cctatcgaca tggagaggag tggcattcac 240 
tgccgcagga tggcgctcgt gacaggggtc atgcacgtct tc 282 



<210> 119 
<211> 166 
<212> DNA 
<213> Zea mays 

<400> 119 

ctggtgggca ggcgttaagg tacaactaca tgcggatgag gacacttacc gatcaatggg 60 
taaagagcat gcactcgtca tatcaaatca tcgaagtgat attgattggc ttattggatg 120 
gatattggcc cagcgctcag ggtgccttgg aagtacgctc gctgtc 166 



<210> 120 
<211> 234 
<212> DNA 
<213> Zea mays 

<400> 120 

agtcanccaa gntccttcca gtcattggct ggtcaatgtg gtttgcagag tacctctttt 60 
nggagaggag ctgggccaag gatgaaaaga cactaaagtg gggtctccaa aggttgaaag 120 
acttccctag accatttngg ctagctcttn tttgtngagg gnantcgctt tactccagca 180 
angnttntng aggnnncagn agnnncgggn ttcccanggg ttaacagncc cana 234 



<210> 121 
<211> 210 
<212> DNA 
<213> Zea mays 



<400> 121 

gtgagatgcn aaaatcagat gatgacgttt caaaatggtg taaagacatt tttgtggaca 60 
aaggatgcct tactggacaa acatttggca acaggcactt tcgatgagga gattagacct 120 
atcggccgcc cagtgaaatc atngctggtg accctgtnnt ggtcgtgcct gctgttgttt 180 
ggtgccatcg agntcttcaa gtggacgcag 210 



<210> 122 
<211> 274 
<212> DNA 
<213> Zea mays 



<400> 122 

acncccgaat ccgccgcgcg cgcnccgtcc tcgtcgccgg cggaggcgcc cgcnaccgcc 60 
cacagcagcc tatcgccgga gaaggaacgc cgcggggagc ttttccacng ccatctcccg 120 
tctgacccct ccgagatcgn aagcggcggc catggcgatc ccgctcgtgc tcgtcgtgct 180 
cccgctcggc ctcctcttcc tcctgtccgg cctcatcgtc aacaccatcc aggccatcct 240 
atttgtgaca ataaggccct tttccaagag cttg 274 



<210> 123 
<211> 305 
<212> DNA 
<213> Zea mays 



<400> 123 

ttgcactgag gaaaggccat tagggatata tcaagtacat acataagagc agcttgatga 60 
agttgcctat ttttagctgg gcatttcaca tttttgagtt tatcccggta gaacggaaat 120 
gggagattga tgaagcaatt attcagaaca agctatcaaa atttaagaac ccgagagatc 180 
ctatctggtt ggcggttttt cctgaaggca cggattatac tgagaagaaa tgcatcatga 240 
gtcaagagta tgcttcagaa catggcttgc ctatgctaga acatgtcctc ctttcaaaga 300 
caagg 305 



<210> 124 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 124 

ccagattttc tggacaatgt gtatggcgtt gatccttctg aagtccacat ccacgtcaga 60 
atggttcagc tccatcacat ccccacaaca gaagacaaga taacagaatg gatggncgag 120 
aggtttaggc agaaggacca gctcctggca gatttcttca tgaaggggca tttcctgatg 180 
aaaggaactg aaaggagatc tgtcgacgcc gagtgcctgg caaactttct taaccagtag 240 
tatgcttgac ggccnatctg gtttgtacct aaactcttt 279 



<210> 125 
<211> 219 
<212> DNA 
<213> Zea mays 



<400> 125 

agattttntg gacaatgtgt atggngttga tccttntgaa gtncacatcc acgtnagaat 60 
ggttcagctc catcacatcc ccacaacagn agacaagata acagaangga tggtagagag 120 
gtttaggcag aaggaccagc tcctggcaga tttcttcatg aaggggcact ttcctgatga 180 
aggaactgaa ggagatctgt cgacgccgaa gtgcctggc 219 



<210> 126 
<211> 293 
<212> DNA 
<213> Zea mays 



<400> 126 

taccatagat gctgtgtacg acatcacgat cgcntacaaa caccggcngc ngacatttct 60 
ngacaacgtc tacngcgtgg ntccttcgga agtccacatc cacatcanca gcatccaggt 120 
ctccgacata ncggcgtccg aaaaacgggg tggctggcng gntnngtgga gcggttcaag 180 
gcntnganna acgagctngc tgttcggggc tttctaccgc ggctggggcc aatttcnccc 240 
cgaacgaaag ggaaaaaggg gaaccgaagg ggggaacctg ttngaacggg ncc 293 



<210> 127 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 127 

Val Xaa Asn His Xaa Ser 
1 5 



<210> 128 
<211> 6 
<212> PRT 
<213> conserved sequence 
<400> 128 

Val Thr Tyr Ser Xaa Ser 

1 5 

<210> 129 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 129 

Val Xaa Leu Thr Arg Xaa Arg 
1 5 



<210> 130 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 130 

Cys Pro Glu Gly Thr 
1 5 



<210> 131 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 131 

Ile Val Pro Val Ala 
1 5 



<210> 132 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 132 

Leu Xaa Xaa Gly Asp Leu Val 
1 5 



<210> 133 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 133 

Phe Xaa Xaa Gly Ala Phe 
1 5 



<210> 134 
<211> 6 
<212> PRT 

<213> Synthetic Oligonucleotide 
<400> 134 

Val Ala Asn Xaa Xaa Gln 
1 5 



<210> 135 
<211> 30 
<212> DNA 
<213> Synthetic Oligonucleotide 
<400> 135 

ccatccgctt caagggaacg acacccatca 30 

<210> 136 
<211> 31 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 136 

tccctgtctt gcttgatgaa cttaaagctt g 31 

<210> 137 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 137 

acagcaggag tgtctgatga tggcagattc 30 

<210> 138 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 138 

actggagttc cagccaaaaa tgcacctgtc 30 

<210> 139 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 139 

gatacaccct tgaaatcagg cgattttgct 30 

<210> 140 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 

<400> 140 

ttgcaaattc aattcctgtt tcaccgggcc 30 

<210> 141 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 141 

gttttctgct attccagaag gcgtcaacaa 30 

<210> 142 
<211> 32 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 142 

cattgaagat ccgtccgtga agttncctta cc 32 

<210> 143 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 143 

tcgagctgtg atcgatgatt ggctgtgaag 30 



<210> 144 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 



<400> 144 

gtctcttcaa aaacacacac acacgtctct 30 



<210> 145 
<211> 30 
<212> DNA 



<213> Synthetic Oligonucleotide 



<400> 145 



gtctcttcaa aaacacacac acacgtctct 30 



<210> 146 
<211> 30 
<212> DNA 



<213> Synthetic Oligonucleotide 



<400> 146 

gtagagagcc ttacttgctt cggtttagtc 30 



<210> 147 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 147 

acgtcatcgt acctgttgct attgactcac 30 

<210> 148 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 148 

acttttccat tgtcagggac tcctcgacac 30 

<210> 149 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 149 

acggtgtagg aagggaaagg attcaaaagg 30 

<210> 150 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 150 

gcgatgaact acagagtcgg attcttcctc 30 

<210> 151 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 151 

ccggtttacg agattacgtt cttgaaccag 30 

<210> 152 

<211> 30 

<212> DNA 

<213> Synthetic Oligonucleotide 



<400> 152 

caatggagac aaggctcgaa agtgctaacc 30 

<210> 153 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 

<400> 153 

attctctgaa catagttcgc cacggtcatg 30 

<210> 154 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 154 

gaaatccaac gccttcccaa tatcactctg 30 

<210> 155 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 155 

cttcaacttt ccatcaggat cttggcacgt 30 

<210> 156 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 156 

accacttgtt agagacctta cctgcttagg 30 

<210> 157 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 157 

tcctacctac accatccaat ttctcgaccc 30 

<210> 158 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 158 

ctgcgtcaag tgagcaactc agttcttgca 30 

<210> 159 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 159 

tgggaagcag cacgttgttc agtatcggaa 30 

<210> 160 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 160 

tagcctctgt gtaatctgtg ccctcgggga 30 

<210> 161 

<211> 1702 

<212> DNA 

<213> Simmondsia chinensis 

<400> 161 

gaattccagc ctctctcctc ctgcaattct acttgctttc tacgatcttt ccctctctct 60 
ctctaaaacc ttaaaattgg aatggaatcg tttaaaaata tgatcttttt gtaattgaat 120 
tagtataatt atatctgggt aatcttgaat ttgttggtga ggccatgggg atcccagctg 180 
cggctgtgat tgtaccgctt ggcttgctct tcttcttctc tggtctcttc atcaacttca 240 
ttcaggcaat ttgttttgtg ctcgtgcggc cactgtcaaa gnncacatac agaaggatta 300 
acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa 360 
gtgttaagat caagttgttc acagatcccg atacctttcg gctaatgggc aaagagcatg 420 
cacttgtgat atcaaaccac agaagtgata ttgattggct tgttggatgg gtgttggccc 480 
agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540 
cggtcatagg ttggtctatg tggttttctg agtacctttt tcttgagaga agctgggcca 600 
aggatgaaag cacattgaag ttaggtcttc aacgcctcaa ggactaccct ctgcctttct 660 
ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag 720 
aatatgctac ttcaatggga ttgccagttc ctagaaatac tttgatccct cgtactaagg 780 
gatttgtttc agccgtgagc catatgcgtt cgtttgtccc ggccatatat gatgtaacgg 840 
tggccatccc taaatcttct tcgcagccta caatgctcag acttttcaaa ggccagccat 900 
ccacggttca tgtacacatc aagcgccgct cgatgaaaga tctccctgaa gcagcagatg 960 
atgttgcaca atggtgtcga gacacattcg tcgcaaagga tgcactcctg gacaagcata 1020 
atgtagatga cactttcgga gatgagtatc tgcaggacac tggccggcct ttgaaatctc 1080 
tctttgtagc agtctcttgg gcattgattc tcatcctggg aggtttgaaa ttcctacgat 1140 
ggtcgtccct tctatcatca tggaaggggg tcgccttctc agccgcatgc cttgtgctcg 1200 
tcaccattct tatgcagatc ttaatccaat tttctcaatc cgagcgctcg actcctgcta 1260 
aggtagcccc aggaaagccc aagaacatgg tatcagaacc cacggaaacg caacgacata 1320 
agcagcacta aaagtatata tggaccccaa ctaagacgat tcagacgcaa gccacagttg 1380 
attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440 
aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt 1500 
tatcagaatt cgtgattccg ggaccgatcc cggatcttag ccttctatgc atggattatg 1560 
atagtatctt aaatttcttt aatgatgtac cggaattata atgctagtta attaggggga 1620 
tgagcattgt ttgggtttat atcgtggtaa atccttgtat tgtttataag atttgaagaa 1680 
aattcgattc gagtgctctg aa 1702 



<210> 162 
<211> 387 
<212> PRT 

<213> Simmondsia chinensis 



<400> 162 
Met Gly Ile Pro Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe 
1 5 10 15 

Phe Phe Ser Gly Leu Phe Ile Asn Phe Ile Gln Ala Ile Cys Phe Val 
20 25 30 

Leu Val Arg Pro Leu Ser Lys Thr Tyr Arg Arg Ile Asn Arg Val Leu 
35 40 45 

Val Glu Leu Leu Trp Leu Glu Leu Ile Trp Leu Val Asp Trp Trp Ala 
50 55 60 

Ser Val Lys Ile Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met 
65 70 75 80 

Gly Lys Glu His Ala Leu Val Ile Ser Asn His Arg Ser Asp Ile Asp 
85 90 95 

Trp Leu Val Gly Trp Val Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser 
100 105 110 

Thr Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly 
115 120 125 

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trp Ala 
130 135 140 

Lys Asp Glu Ser Thr Leu Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr 
145 150 155 160 

Pro Leu Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr 
165 170 175 

Gln Ala Lys Leu Leu Ala Ala Gln Glu Tyr Ala Thr Ser Met Gly Leu 
180 185 190 

Pro Val Pro Arg Asn Thr Leu Ile Pro Arg Thr Lys Gly Phe Val Ser 
195 200 205 

Ala Val Ser His Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Val Thr 
210 215 220 

Val Ala Ile Pro Lys Ser Ser Ser Gln Pro Thr Met Leu Arg Leu Phe 
225 230 235 240 

Lys Gly Gln Pro Ser Thr Val His Val His Ile Lys Arg Arg Ser Met 
245 250 255 

Lys Asp Leu Pro Glu Ala Ala Asp Asp Val Ala Gln Trp Cys Arg Asp 
260 265 270 

Thr Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Asn Val Asp Asp 
275 280 285 

Thr Phe Gly Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser 
290 295 300 

Leu Phe Val Ala Val Ser Trp Ala Leu Ile Leu Ile Leu Gly Gly Leu 
305 310 315 320 

Lys Phe Leu Arg Trp Ser Ser Leu Leu Ser Ser Trp Lys Gly Val Ala 
325 330 335 

Phe Ser Ala Ala Cys Leu Val Leu Val Thr Ile Leu Met Gln Ile Leu 
340 345 350 

Ile Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Ala Pro 
355 360 365 

Gly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His 
370 375 380 

Lys Gln His 
385 



<210> 163 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 163 

aagcttgcat gcgtcgacac aatggttcat gcgaccaagt cag 43 



<210> 164 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 164 

ggtaccgtcg actcacttct tggtgttgtt gatag 

<210> 165 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 165 

ggatccgcgg ccgcacaatg acgagcttta ctacttccct tcat 

<210> 166 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 166 

ggatcccctg caggttagag atccattgat tctgcaat 

<210> 167 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 167 

ggatccgcgg ccgcataatg gaatcagagc tcaaagat 

<210> 168 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 168 

ggatcccctg caggtcattc ttctttctga tggaaatc 

<210> 169 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 169 

ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a 

<210> 170 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 170 

ggatcccctg caggtcactt ctcttccaat ctagccag 

<210> 171 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 171 

ggatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 

<210> 172 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 172 

ggatcccctg caggttattt tttcttgaca actccgttat taccgg 

<210> 173 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 173 

atatccgcgg ccgcacaatg gttatggagc aagctggaa 

<210> 174 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 174 

ggatcccctg caggtcaatg gagacaaggc tcgaaagt 

<210> 175 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 175 

ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc 

<210> 176 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 176 

ggatcccctg caggttaatt tttcttaact actccatt 

<210> 177 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 177 

ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc 

<210> 178 

<211> 38 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 178 

ggatcccctg caggtcacgt cttctccttc ttcaccgg 

<210> 179 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 179 

ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct 

<210> 180 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 180 

ggatcccctg caggttatgt tggggccaag tcaggtgcaa agat 

<210> 181 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 181 

ggatccgcgg ccgcaaaatg gaaaaaaaga gtgtaccaaa ttct 

<210> 182 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 182 

ggatcccctg caggttattt gtttactaat ttgagggaat tttttg 

<210> 183 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 183 

tcgacctgca ggaagcttaa ggatggtgat tgctgc 

<210> 184 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 184 

ggatccgcgg ccgcttactt ctccttctcc g 

<210> 185 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 185 

ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 

<210> 186 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 186 

ggatcccctg caggtcaatc atccttaccc tttggtttac c 

<210> 187 
<211> 60 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 187 

atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 

<210> 188 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 188 

tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 

<210> 189 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 189 

ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 

<210> 190 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 190 

ggatcccctg caggtcaatg attttttttc atcacaaata c 41 

<210> 191 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 191 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg ctgtgcggta tttcacaccg 60 

<210> 192 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 192 

tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60 

<210> 193 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 193 

ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43 

<210> 194 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 194 

ggatcccctg caggttattt ggtctcaatt ttaatatttt tttgc 45 

<210> 195 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 195 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60 

<210> 196 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 196 

ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60 

<210> 197 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 197 

ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44 

<210> 198 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 198 

ggatcccctg caggctactt cctcttttta cgttgatcgc tg 42 

<210> 199 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 199 

atggaaaagt acaccaattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 

<210> 200 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 200 

ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 

<210> 201 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 201 

ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 

<210> 202 
<211> 38 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 202 

ggatcccctg caggctacgc atctccttct ttcccttc 38 

<210> 203 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 203 

atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 

<210> 204 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 204 

ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60 

<210> 205 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 205 

ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46 

<210> 206 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 206 

ggatcccctg caggtcattc tttcttttcg tgttctcttt tctg 44 

<210> 207 
<211> 60 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 207 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60 

<210> 208 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 208 

tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 

<210> 209 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 209 

ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagttcg 49 

<210> 210 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 210 

ggatcccctg caggtcaaaa aataaaacaa taaagtttat aaactaacc 49 

<210> 211 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 211 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60 

<210> 212 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 212 

tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 

<210> 213 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 213 

ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 

<210> 214 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 214 

ggatcccctg caggttaatg catctttttt acagatgaac c 41 

<210> 215 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 215 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 

<210> 216 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 216 

ttaatgcatc ttttttacag atgaaccttc gttatgggta agattgtact gagagtgcac 60 

<210> 217 
<211> 381 
<212> PRT 

<213> Saccharomyces sp. 

<220> 

<400> 217 

Met Ser Phe Arg Asp Val Leu Glu Arg Gly Asp Glu Phe Leu Glu Ala 
1 5 10 15 

Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser 
20 25 30 

Leu Leu Thr Phe Gly Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn 
35 40 45 

Val Lys Leu Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser 
50 55 60 

Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn His Met Ser Met 
65 70 75 80 

Val Asp Asp Pro Leu Val Trp Ala Thr Leu Pro Tyr Lys Leu Phe Thr 
85 90 95 

Ser Leu Asp Asn Ile Arg Trp Ser Leu Gly Ala His Asn Ile Cys Phe 
100 105 110 

Gln Asn Lys Phe Leu Ala Asn Phe Phe Ser Leu Gly Gln Val Leu Ser 
115 120 125 

Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser Ile Asp Ala Ser 
130 135 140 

Ile Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Trp Thr Pro 
145 150 155 160 

His Ser Glu Val Ser Ser Ser Leu Lys Lys Ala Tyr Ser Pro Pro Ile 
165 170 175 

Ile Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val 
180 185 190 

Leu Gln Leu Tyr Pro Pro Phe Glu Asn Ser Met Arg Tyr Phe Lys Trp 
195 200 205 

Gly Ile Thr Arg Met Ile Leu Glu Ala Thr Lys Pro Pro Ile Val Val 
210 215 220 

Pro Ile Phe Ala Thr Gly Phe Glu Lys Ile Ala Ser Glu Ala Val Thr 
225 230 235 240 

Asp Ser Met Phe Arg Gln Ile Leu Pro Arg Asn Phe Gly Ser Glu Ile 
245 250 255 

Asn Val Thr Ile Gly Asp Pro Leu Asn Asp Asp Leu Ile Asp Arg Tyr 
260 265 270 

Arg Lys Glu Trp Thr His Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn 
275 280 285 

Pro Asn Asp Leu Ser Asp Glu Leu Lys Tyr Gly Lys Glu Ala Gln Asp 
290 295 300 

Leu Arg Ser Arg Leu Ala Ala Glu Leu Arg Ala His Val Ala Glu Ile 
305 310 315 320 

Arg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser 
325 330 335 

Pro Ser Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro 
340 345 350 

Asp Val Lys Val Ile Gly Glu Asn Trp Ala Ile Arg Arg Met Gln Lys 
355 360 365 

Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp 
370 375 380 

<210> 218 

<211> 396 

<212> PRT 

<213> Saccharomyces sp. 

<220> 

<400> 218 

Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly Ile Tyr Glu Lys Thr 
1 5 10 15 

Gly Asn Pro Phe Ile Lys Gly Leu Gln Arg Leu Leu Ile Ala Cys Leu 
20 25 30 

Phe Ile Ser Gly Ser Leu Ser Ile Val Val Phe Gln Ile Cys Leu Gln 
35 40 45 

Val Leu Leu Pro Trp Ser Lys Ile Arg Phe Gln Asn Gly Ile Asn Gln 
50 55 60 

Ser Lys Lys Ala Phe Ile Val Leu Leu Cys Met Ile Leu Asn Met Val 
65 70 75 80 

Ala Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys 
85 90 95 

Asn Ser Ser Asn Ala Lys Pro Cys Phe Arg Phe Lys Asp Arg Ala Ile 
100 105 110 

Ile Ile Ala Asn His Gln Met Tyr Ala Asp Trp Ile Tyr Leu Trp Trp 
115 120 125 

Leu Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr Ile Ile Leu Lys 
130 135 140 

Lys Ala Leu Gln Tyr Ile Pro Leu Leu Gly Phe Gly Met Arg Asn Phe 
145 150 155 160 

Lys Phe Ile Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu 
165 170 175 

Thr Asn Ser Leu Val Ser Met Asp Leu Asn Ala Arg Cys Lys Gly Pro 
180 185 190 

Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Lys Thr Asn Glu Ser Ile Ala 
195 200 205 

Ala Tyr Asn Leu Ile Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys 
210 215 220 

Thr Arg Glu Lys Ser Glu Ala Phe Cys Gln Arg Ala His Leu Asp His 
225 230 235 240 

Val Gln Leu Arg His Leu Leu Leu Pro His Ser Lys Gly Leu Lys Phe 
245 250 255 

Ala Val Glu Lys Leu Ala Pro Ser Leu Asp Ala Ile Tyr Asp Val Thr 
260 265 270 

Ile Gly Tyr Ser Pro Ala Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe 
275 280 285 

Thr Leu Lys Lys Ile Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp 
290 295 300 

Phe Tyr Ile Arg Glu Phe Arg Val Asn Glu Ile Pro Leu Gln Asp Asp 
305 310 315 320 

Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln 
325 330 335 

Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys 
340 345 350 

Asn Asp Asn Gln Ser Ile Val Val Thr Thr Gln Thr Thr Gly Phe Gln 
355 360 365 

His Glu Thr Leu Thr Pro Arg Ile Leu Ser Tyr Tyr Gly Phe Phe Ala 
370 375 380 

Phe Leu Ile Leu Val Phe Val Met Lys Lys Asn His 
385 390 395 

<210> 219 

<211> 479 

<212> PRT 

<213> Saccharomyces sp. 
<220> 

<400> 219 

Met Gly Phe Val Asp Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val 
1 5 10 15 

Gln Phe Lys Gln Leu Asp Ile Ser Asp Trp Leu Ser Leu Thr Pro Arg 
20 25 30 

Leu Leu Ile Leu Phe Gly Tyr Phe Tyr Leu His Ser Phe Phe Thr Ala 
35 40 45 

Ile Asn Gln Phe Leu Gln Phe Ile Asn Thr Asn Ser Phe Cys Leu Arg 
50 55 60 

Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro Ile Ile Gly 
65 70 75 80 

Glu Tyr Lys Ile Arg Leu Leu Ser Arg Ala Leu Thr Tyr Ser Lys Leu 
85 90 95 

Lys Ile Ile Pro Thr Leu Asp Lys Val Leu Glu Ala Ile Glu Ile Trp 
100 105 110 

Phe Gln Leu His Leu Val Glu Met Thr Phe Glu Lys Lys Lys Asn Val 
115 120 125 

Gln Ile Phe Ile Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp 
130 135 140 

Ser Lys Phe Gln Thr Thr Leu Met Ile Cys Asn His Arg Ser Val Asn 
145 150 155 160 

Asp Tyr Thr Leu Ile Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys 
165 170 175 

Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp 
180 185 190 

Leu Ala Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe 
195 200 205 

Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn Ile Phe Phe Lys Asp Glu 
210 215 220 

Thr Leu Ala Leu Ser Ser Asn Glu Leu Arg Asp Ile Leu Glu Arg Gln 
225 230 235 240 

Asn Asn Gln Ala Ile Thr Ile Phe Pro Glu Val Asn Ile Met Ser Leu 
245 250 255 

Glu Leu Ser Ile Ile Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val 
260 265 270 

Ile Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr 
275 280 285 

Leu Met Ala Ala Phe Ser Ser Ile Lys Asn Ile Lys Arg Lys Lys Asn 
290 295 300 

Arg Asn Asn Ile Ile Lys Glu Ala Arg Tyr Leu Phe His Arg Glu Leu 
305 310 315 320 

Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser 
325 330 335 

Asp Lys Thr Thr Pro Pro Met Ile Val Asp Asn Ser Tyr Leu Leu Thr 
340 345 350 

Lys Lys Glu Glu Ile Ser Ser Gly Lys Pro Lys Val Val Arg Ile Asn 
355 360 365 

Pro Tyr Ile Tyr Asp Val Thr Ile Ile Tyr Tyr Arg Val Lys Tyr Thr 
370 375 380 

Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly 
385 390 395 400 

Tyr Gln Leu Glu Gln Ile Ser Pro Thr Ile Phe Glu Met Ile Gln Pro 
405 410 415 

Glu Met Glu Ser Glu Asn Asn Ile Lys Asp Lys Asp Pro Ile Val Val 
420 425 430 

Met Val Asn Val Lys Lys His Gln Ile Gln Pro Leu Leu Ala Tyr Asn 
435 440 445 

Asp Glu Ser Leu Glu Lys Trp Leu Glu Asn Arg Trp Ile Glu Lys Asp 
450 455 460 

Arg Leu Ile Glu Ser Leu Gln Lys Asn Ile Lys Ile Glu Thr Lys 
465 470 475 

<210> 220 
<211> 300 
<212> PRT 

<213> Saccharomyces sp. 
<400> 220 

Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly Ile Ala Pro 
1 5 10 15 

Phe Leu Pro Asn Thr Ile Arg Lys Pro Ser Lys Val Met Thr Ala Cys 
20 25 30 

Leu Leu Gly Ile Leu Gly Val Lys Thr Ile Ile Met Leu Pro Leu Ile 
35 40 45 

Met Leu Tyr Leu Leu Thr Gly Gln Asn Asn Leu Leu Gly Leu Ile Leu 
50 55 60 

Lys Phe Thr Phe Ser Trp Lys Glu Glu Ile Thr Val Gln Gly Ile Lys 
65 70 75 80 

Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu 
85 90 95 

Tyr Ile Cys Asn Cys Thr Ser Pro Leu Asp Ala Phe Ser Val Val Leu 
100 105 110 

Leu Ala Gln Gly Pro Val Thr Leu Leu Val Pro Ser Asn Asp Ile Val 
115 120 125 

Tyr Lys Val Ser Ile Arg Glu Phe Ile Asn Phe Ile Leu Ala Gly Gly 
130 135 140 

Leu Asp Ile Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu 
145 150 155 160 

Gly Asn Thr Val Asn Phe Met Phe Ala Glu Gly Thr Ser Cys Asn Gly 
165 170 175 

Lys Ser Val Leu Pro Phe Ser Ile Thr Gly Lys Lys Leu Lys Glu Phe 
180 185 190 

Ile Asp Pro Ser Ile Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys 
195 200 205 

Lys Phe Glu Leu Gln Thr Ile Gln Ile Lys Thr Asn Lys Thr Ala Ile 
210 215 220 

Thr Thr Leu Pro Ile Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn 
225 230 235 240 

Lys Gly Ile Asn Val Lys Cys Lys Ile Asn Glu Pro Gln Val Leu Ser 
245 250 255 

Asp Asn Leu Glu Glu Leu Arg Val Ala Leu Asn Gly Gly Asp Lys Tyr 
260 265 270 

Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val 
275 280 285 

Lys Glu Tyr Ile Ser Asp Gln Arg Lys Lys Arg Lys 
290 295 300 



<210> 221 
<211> 759 
<212> PRT 

<213> Saccharomyces sp. 

<400> 221 

Met Pro Ala Pro Lys Leu Thr Glu Lys Phe Ala Ser Ser Lys Ser Thr 
1 5 10 15 

Gln Lys Thr Thr Asn Tyr Ser Ser Ile Glu Ala Lys Ser Val Lys Thr 
20 25 30 

Ser Ala Asp Gln Ala Tyr Ile Tyr Gln Glu Pro Ser Ala Thr Lys Lys 
35 40 45 

Ile Leu Tyr Ser Ile Ala Thr Trp Leu Leu Tyr Asn Ile Phe His Cys 
50 55 60 

Phe Phe Arg Glu Ile Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln 
65 70 75 80 

Gly Pro Val Ile Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp 
85 90 95 

Pro Val Ile Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val 
100 105 110 

Ser Phe Leu Ile Ala Glu Ser Ser Leu Lys Gln Pro Pro Ile Gly Phe 
115 120 125 

Leu Ala Ser Phe Phe Met Ala Ile Gly Val Val Arg Pro Gln Asp Asn 
130 135 140 

Leu Lys Pro Ala Glu Gly Thr Ile Arg Val Asp Pro Thr Asp Tyr Lys 
145 150 155 160 

Arg Val Ile Gly His Asp Thr His Phe Leu Thr Asp Cys Met Pro Lys 
165 170 175 

Gly Leu Ile Gly Leu Pro Lys Ser Met Gly Phe Gly Glu Ile Gln Ser 
180 185 190 

Ile Glu Ser Asp Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Ala 
195 200 205 

Lys Pro Glu Ile Lys Thr Ala Leu Leu Thr Gly Thr Thr Tyr Lys Tyr 
210 215 220 

Ala Ala Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His 
225 230 235 240 

Leu Ala His Asn Asn Cys Ile Gly Ile Phe Pro Glu Gly Gly Ser His 
245 250 255 

Asp Arg Thr Asn Leu Leu Pro Leu Lys Ala Gly Val Ala Ile Met Ala 
260 265 270 

Leu Gly Cys Met Asp Lys His Pro Asp Val Asn Val Lys Ile Val Pro 
275 280 285 

Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val 
290 295 300 

Val Glu Phe Gly Asp Pro Ile Glu Ile Pro Lys Glu Leu Val Ala Lys 
305 310 315 320 

Tyr His Asn Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp 
325 330 335 

Thr Ile Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cys Ser Asp Tyr 
340 345 350 

Glu Thr Leu Met Val Val Gln Thr Ile Arg Arg Leu Tyr Met Thr Gln 
355 360 365 

Phe Ser Thr Lys Leu Pro Leu Pro Leu Ile Val Glu Met Asn Arg Arg 
370 375 380 

Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys Ile Ala Asp 
385 390 395 400 

Leu Thr Lys Asp Ile Met Ala Tyr Asn Ala Ala Leu Arg His Tyr Asn 
405 410 415 

Leu Pro Asp His Leu Val Glu Glu Ala Lys Val Asn Phe Ala Lys Asn 
420 425 430 

Leu Gly Leu Val Phe Phe Arg Ser Ile Gly Leu Cys Ile Leu Phe Ser 
435 440 445 

Leu Ala Met Pro Gly Ile Ile Met Phe Ser Pro Val Phe Ile Leu Ala 
450 455 460 

Lys Arg Ile Ser Gln Glu Lys Ala Arg Thr Ala Leu Ser Lys Ser Thr 
465 470 475 480 

Val Lys Ile Lys Ala Asn Asp Val Ile Ala Thr Trp Lys Ile Leu Ile 
485 490 495 

Gly Met Gly Phe Ala Pro Leu Leu Tyr Ile Phe Trp Ser Val Leu Ile 
500 505 510 

Thr Tyr Tyr Leu Arg His Lys Pro Trp Asn Lys Ile Tyr Val Phe Ser 
515 520 525 

Gly Ser Tyr Ile Ser Cys Val Ile Val Thr Tyr Ser Ala Leu Ile Val 
530 535 540 

Gly Asp Ile Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu 
545 550 555 560 

Ser Leu Thr Ser Pro Lys Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg 
565 570 575 

Asn Leu Ala Glu Arg Ile Ile Glu Val Val Asn Asn Phe Gly Ser Glu 
580 585 590 

Leu Phe Pro Asp Phe Asp Ser Ala Ala Leu Arg Glu Glu Phe Asp Val 
595 600 605 

Ile Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Asn Arg Arg 
610 615 620 

Lys Met Leu Arg Lys Gln Lys Ile Lys Arg Gln Glu Lys Asp Ser Ser 
625 630 635 640 

Ser Pro Ile Ile Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His 
645 650 655 

Asn Gln Asp Ser Asp Gly Val Ser Leu Val Asn Ser Asp Asn Ser Leu 
660 665 670 

Ser Asn Ile Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser 
675 680 685 

Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu 
690 695 700 

Val Glu Asn Glu Ile Leu Glu Glu Lys Asn Gly Leu Ala Ser Lys Ile 
705 710 715 720 

Ala Gln Ala Val Leu Asn Lys Arg Ile Gly Glu Asn Thr Ala Arg Glu 
725 730 735 

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu 
740 745 750 

Glu Gly Lys Glu Gly Asp Ala 
755 

<210> 222 
<211> 743 
<212> PRT 

<213> Saccharomyces sp. 



<400> 222 

Met Ser Ala Pro Ala Ala Asp His Asn Ala Ala Lys Pro Ile Pro His 
1 5 10 15 

Val Pro Gln Ala Ser Arg Arg Tyr Lys Asn Ser Tyr Asn Gly Phe Val 
20 25 30 

Tyr Asn Ile His Thr Trp Leu Tyr Asp Val Ser Val Phe Leu Phe Asn 
35 40 45 

Ile Leu Phe Thr Ile Phe Phe Arg Glu Ile Lys Val Arg Gly Ala Tyr 
50 55 60 

Asn Val Pro Glu Val Gly Val Pro Thr Ile Leu Val Cys Ala Pro His 
65 70 75 80 

Ala Asn Gln Phe Ile Asp Pro Ala Leu Val Met Ser Gln Thr Arg Leu 
85 90 95 

Leu Lys Thr Ser Ala Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val 
100 105 110 

Thr Ala Glu Ser Ser Phe Lys Lys Arg Phe Ile Ser Phe Phe Gly His 
115 120 125 

Ala Met Gly Gly Ile Pro Val Pro Arg Ile Gln Asp Asn Leu Lys Pro 
130 135 140 

Val Asp Glu Asn Leu Glu Ile Tyr Ala Pro Asp Leu Lys Asn His Pro 
145 150 155 160 

Glu Ile Ile Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn 
165 170 175 

Phe Thr Lys Arg Phe Ser Ala Lys Ser Leu Leu Gly Leu Pro Asp Tyr 
180 185 190 

Leu Ser Asn Ala Gln Ile Lys Glu Ile Pro Asp Asp Glu Thr Ile Ile 
195 200 205 

Leu Ser Ser Pro Phe Arg Thr Ser Lys Ser Lys Val Val Glu Leu Leu 
210 215 220 

Thr Asn Gly Thr Asn Phe Lys Tyr Ala Glu Lys Ile Asp Asn Thr Glu 
225 230 235 240 

Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lys Gly Cys Val Gly 
245 250 255 

Ile Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro Ile 
260 265 270 

Lys Ala Gly Val Ala Ile Met Ala Leu Gly Ala Val Ala Ala Asp Pro 
275 280 285 

Thr Met Lys Val Ala Val Val Pro Cys Gly Leu His Tyr Phe His Arg 
290 295 300 

Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro Ile Val 
305 310 315 320 

Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr 
325 330 335 

Val Ser Lys Leu Leu Lys Lys Ile Thr Asn Ser Leu Phe Ser Val Thr 
340 345 350 

Glu Asn Ala Pro Asp Tyr Asp Thr Leu Met Val Ile Gln Ala Ala Arg 
355 360 365 

Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Ala Ile Val 
370 375 380 

Glu Ile Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp 
385 390 395 400 

Pro Arg Ile Ile His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys 
405 410 415 

Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr 
420 425 430 

Thr Lys Leu Glu Ala Leu Arg Cys Phe Val Thr Leu Ile Val Arg Leu 
435 440 445 

Ile Lys Phe Ser Val Phe Ala Ile Leu Ser Leu Pro Gly Ser Ile Leu 
450 455 460 

Phe Thr Pro Ile Phe Ile Ile Cys Arg Val Tyr Ser Glu Lys Lys Ala 
465 470 475 480 

Lys Glu Gly Leu Lys Lys Ser Leu Val Lys Ile Lys Gly Thr Asp Leu 
485 490 495 

Leu Ala Thr Trp Lys Leu Ile Val Ala Leu Ile Leu Ala Pro Ile Leu 
500 505 510 

Tyr Val Thr Tyr Ser Ile Leu Leu Ile Ile Leu Ala Arg Lys Gln His 
515 520 525 

Tyr Cys Arg Ile Trp Val Pro Ser Asn Asn Ala Phe Ile Gln Phe Val 
530 535 540 

Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr 
545 550 555 560 

Gly Glu Ile Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val 
565 570 575 

Ser Ile Val Tyr Pro Gly Lys Lys Ile Glu Glu Ile Gln Thr Thr Arg 
580 585 590 

Lys Asn Leu Ser Leu Glu Leu Thr Ala Val Cys Asn Asp Leu Gly Pro 
595 600 605 

Leu Val Phe Pro Asp Tyr Asp Lys Leu Ala Thr Glu Ile Phe Ser Lys 
610 615 620 

Arg Asp Gly Tyr Asp Val Ser Ser Asp Ala Glu Ser Ser Ile Ser Arg 
625 630 635 640 

Met Ser Val Gln Ser Arg Ser Arg Ser Ser Ser Ile His Ser Ile Gly 
645 650 655 

Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu 
660 665 670 

Thr Asp Ile Pro Ile Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser 
675 680 685 

Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro 
690 695 700 

Ala Ile Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser 
705 710 715 720 

Arg Asn Thr Asn Ile Ser Ser Lys Ile Ala Ser Leu Val Arg Gln Lys 
725 730 735 

Arg Glu His Glu Lys Lys Glu 
740 

<210> 223 
<211> 397 
<212> PRT 

<213> Saccharomyces sp. 
<400> 223 

Met Leu His Gln Lys Ile Ala His Lys Val Arg Lys Val Val Val Pro 
1 5 10 15 

Gly Ile Ser Leu Leu Ile Phe Phe Gln Gly Cys Leu Ile Leu Leu Phe 
20 25 30 

Leu Gln Leu Thr Tyr Lys Thr Leu Tyr Cys Arg Asn Asp Ile Arg Lys 
35 40 45 

Gln Ile Gly Leu Asn Lys Thr Lys Arg Leu Phe Ile Val Leu Val Ser 
50 55 60 

Ser Ile Leu His Val Val Ala Pro Ser Ala Val Arg Ile Thr Thr Glu 
65 70 75 80 

Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys 
85 90 95 

Arg Ile Leu Ser His Leu Lys Ser Asn Ser Val Ala Ile Cys Asn His 
100 105 110 

Gln Ile Tyr Thr Asp Trp Ile Phe Leu Trp Trp Leu Ala Tyr Thr Ser 
115 120 125 

Asn Leu Gly Ala Asn Val Phe Ile Ile Leu Lys Lys Ser Leu Ala Ser 
130 135 140 

Ile Pro Ile Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe Ile Phe Met 
145 150 155 160 

Ser Arg Lys Trp Ala Gln Asp Lys Ile Thr Leu Ser Asn Ser Leu Ala 
165 170 175 

Gly Leu Asp Ser Asn Ala Arg Gly Ala Gly Ser Leu Ala Gly Lys Ser 
180 185 190 

Pro Glu Arg Ile Thr Glu Glu Gly Glu Ser Ile Trp Asn Pro Glu Val 
195 200 205 

Ile Asp Pro Lys Gln Ile His Trp Pro Tyr Asn Leu Ile Leu Phe Pro 
210 215 220 

Glu Gly Thr Asn Leu Ser Ala Asp Thr Arg Gln Lys Ser Ala Lys Tyr 
225 230 235 240 

Ala Ala Lys Ile Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro His 
245 250 255 

Ser Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Ile Glu 
260 265 270 

Ser Leu Tyr Asp Ile Thr Ile Gly Tyr Ser Gly Val Lys Gln Glu Glu 
275 280 285 

Tyr Gly Glu Leu Ile Tyr Gly Leu Lys Ser Ile Phe Leu Glu Gly Lys 
290 295 300 

Tyr Pro Lys Leu Val Asp Ile His Ile Arg Ala Phe Asp Val Lys Asp 
305 310 315 320 

Ile Pro Leu Glu Asp Glu Asn Glu Phe Ser Glu Trp Leu Tyr Lys Ile 
325 330 335 

Trp Ser Glu Lys Asp Ala Leu Met Glu Arg Tyr Tyr Ser Thr Gly Ser 
340 345 350 

Phe Val Ser Asp Pro Glu Thr Asn His Ser Val Thr Asp Ser Phe Lys 
355 360 365 

Ile Asn Arg Ile Glu Leu Thr Glu Val Leu Ile Leu Pro Thr Leu Thr 
370 375 380 

Ile Ile Trp Leu Val Tyr Lys Leu Tyr Cys Phe Ile Phe 
385 390 395 

<210> 224 
<211> 303 
<212> PRT 

<213> Saccharomyces sp. 
<400> 224 

Met Ser Val Ile Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Val
1 5 10 15

Val Leu Ala Leu Ala Gly Cys Gly Phe Tyr Gly Val Ile Ala Ser Ile
20 25 30

Leu Cys Thr Leu Ile Gly Lys Gln His Leu Ala Gln Trp Ile Thr Ala
35 40 45

Arg Cys Phe Tyr His Val Met Lys Leu Met Leu Gly Leu Asp Val Lys
50 55 60

Val Val Gly Glu Glu Asn Leu Ala Lys Lys Pro Tyr Ile Met Ile Ala
65 70 75 80

Asn His Gln Ser Thr Leu Asp Ile Phe Met Leu Gly Arg Ile Phe Pro
85 90 95

Pro Gly Cys Thr Val Thr Ala Lys Lys Ser Leu Lys Tyr Val Pro Phe
100 105 110

Leu Gly Trp Phe Met Ala Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser
115 120 125

Lys Arg Gln Glu Ala Ile Asp Thr Leu Asn Lys Gly Leu Glu Asn Val
130 135 140

Lys Lys Asn Lys Arg Ala Leu Trp Val Phe Pro Glu Gly Thr Arg Ser
145 150 155 160

Tyr Thr Ser Glu Leu Thr Met Leu Pro Phe Lys Lys Gly Ala Phe His
165 170 175

Leu Ala Gln Gln Gly Lys Ile Pro Ile Val Pro Val Val Val Ser Asn
180 185 190

Thr Ser Thr Leu Val Ser Pro Lys Tyr Gly Val Phe Asn Arg Gly Cys
195 200 205

Met Ile Val Arg Ile Leu Lys Pro Ile Ser Thr Glu Asn Leu Thr Lys
210 215 220

Asp Lys Ile Gly Glu Phe Ala Glu Lys Val Arg Asp Gln Met Val Asp
225 230 235 240

Thr Leu Lys Glu Ile Gly Tyr Ser Pro Ala Ile Asn Asp Thr Thr Leu
245 250 255

Pro Pro Gln Ala Ile Glu Tyr Ala Ala Leu Gln His Asp Lys Lys Val
260 265 270

Asn Lys Lys Ile Lys Asn Glu Pro Val Pro Ser Val Ser Ile Ser Asn
275 280 285

Asp Val Asn Thr His Asn Glu Gly Ser Ser Val Lys Lys Met His
290 295 300



<210> 225 
<211> 1146 
<212> DNA 

<213> Saccharomyces sp.



<400> 225 

atgtctttta gggatgtcct agaaagagga gatgaatttt tagaagccta tcd.cagaaga 60
agcccccttt ggagatttct ttcatacagt acatcattac tgaccttcgg tgtatcaaaa 120
ctgcttcttt tcacatgcta taatgtcaaa ttgaatggtt ttgaaaaatt agaaactgcc 180
ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240
gtcgatgatc cgttagtttg ggcaacacta ccatataagt tatttacgtc tttggacaac 300
ataagatggt ctttgggtgc acataatatt tgctttcaaa ataaatttct ggccaacttt 360
ttctcacttg gccaagtcct ttcaacagaa agatttgggg tgggcccatt tcaaggttct 420
atagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480
cactctgagg tctctccttc gctaaaaaaa gcctactccc cgcccataat aaggtcgaag 540
ccatcttggg tccatgttta tccagaagga tttgtactac aattatatcc gccttttgaa 600
aattcgatga ggtattttaa atggggtatt accagaatga tcctagaagc aacaaagccg 660
cccattgtag taccaatatt tgctacaggg tttgaaaaaa tagcatccga agcagtcaca 720
gattcaatgt ttagacaaat tctaccaaga aactttggct ctgaaataaa tgttaccata 780
ggggatcctt taaatgatga tttaatcgac aggtatagaa aagaatggac acatttggtt 840
gaaaaatact atgatcccaa aaatcctaac gacctctctg acgaattgaa atatggtaaa 900
gaggcgcaag atttaagaag cagattagcc gctgaactga gagcccatgt tgctgaaatt 960
agaaatgaag ttcgcaaatt accacgcgaa gaccctaggt tcaaatcccc ctcatggtgg 1020
aagcggttca acaccacgga aggtaaatcg gacccagatg ttaaagtcat tggcgaaaat 1080
tgggcaataa ggaggatgca aaagtttctg cctccagagg gtaaaccaaa gggtaaggat 1140
gattga 1146

<210> 226
<211> 1191
<212> DNA

<213> Saccharomyces sp.



<400> 226 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg aaaagactgg taatcccttt 60
ataaaagggt tgcaaaggct gcttatcgct tgcttgttca tttcaggctc gctgagtatt 120
gtcgtttttc agatctgtct acaggtgctt ctcccttgga gcaagattag atttcaaaat 180
ggtataaatc aaagtaagaa ggcttttatc gttttattat gcatgatctt gaacatggtg 240
gctccctctt ctttgaatgt cacttttgaa acatcgcggc cattgaagaa ctcttctaac 300
gccaagccat gctttagatt taaagacagg gctataataa ttgcaaatca tcaaatgtat 360
gcagactgga tttatctctg gtggctttcc tttgtttcaa atttgggtgg taacgtttat 420
atcatcctga agaaagctct gcagtacata ccattactgg gatttggcat gcgaaatttt 480
aagtttatat ttttaagtag gaactggcaa aaggatgaga aagctttaac aaatagtttg 540
gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600
tccaagacaa atgaatccat tgccgcttat aatttaatca tgttccctga gggtacaaat 660
ctaagcctca agacaagaga aaaaagcgag gcattctgtc aaagagcaca tttggaccat 720
gtccaattaa gacatttgtt attaccgcac tctaaaggct tgaagtttgc agtagaaaaa 780
ctagctccta gtttagatgc tatctacgat gtcactattg gatattctcc cgccttgaga 840
acggaatacg tcggcaccaa attcaccttg aagaaaatat tcttaatggg tgtctatccg 900
gagaaagtag atttttatat tagggaattt agagttaatg agatcccttt gcaagatgac 960
gaagtttttt tcaattggtt actgggcgtg tggaaagaaa aagatcaact gctagaagac 1020
tactacaaca caggccaatt taaaagtaat gctaaaaatg acaaccaatc catcgttgtt 1080
acgacacaaa cgactggatt tcagcacgaa acattgacac cccgtatcct ttcatattac 1140
gggttcttcg cttttcttat tcttgtattt gtgatgaaaa aaaatcattg a 1191
<210> 227
<211> 1440 
<212> DNA 

<213> Saccharomyces sp. 



<400> 227 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctagggtcca gttcaaacag 60
ttagatattt ctgattggtt gagtctgacc ccaaggttgc ttattctttt tggctatttt 120
taccttcatt ctttttttac tgcaatcaat caattcctac agttcattaa cacgaattcc 180
ttctgtctta gactgcattt actatatgac agattttggt cgcatgtgcc cataataggt 240
gagtacaaaa ttcggctgct ctcgagggca ctgacatata gtaaactgaa aataatacca 300
actttagaca aggtgctgga ggcgattgaa atttggtttc agctacattt agttgaaatg 360
accttcgaaa aaaaaaaaaa cgtccaaatt ttcataaccg agggaagtga tgacctaaac 420
ttttttaaag atagcaaatt ccaaaccaca ttaatgatat gtaatcatcg atcagtgaat 480
gactacacat tgattaatta cctttttctc aaaagttgtc ccaccaagtt ttatactaaa 540
tgggaatttc tacaaaagct gaggaagggg gaagatctag ctgaatggcc tcagttaaaa 600
tttcttggtt ggggaaaaat gtttaacttt cctcgattgg atctactaaa gaacatattc 660
ttcaaagatg aaacactcgc actctcatcg aatgagttaa gagatatttt agaaagacaa 720
aacaatcaag ctattactat ttttcccgaa gtcaatatca tgagtttgga actatcaatt 780
attcaaagaa aattacacca agattttccc tttgttataa acttctataa tttattatac 840
ccaagattta aaaactttac cactttgatg gctgcttttt catcaattaa aaacatcaaa 900
agaaagaaaa accgtaacaa tataatcaaa gaggcccgat acctgtttca cagagaactt 960
gacaaattag ttcacaagag catgaaaatg gagtcttcca aggtatccga taagacgacg 1020
ccgcccatga tcgtagataa ttcatactta cttacaaaaa aggaagaaat cagcagcggc 1080
aagcccaagg tggtacgaat caatccatac atatatgatg tcaccataat ttattaccga 1140
gtcaaatata ctgatagtgg gcatgatcat accaacggag atttgagact tcataaaggt 1200
tatcaattag agcaaatatc tccgacaatc tttgagatga ttcaaccaga aatggagtct 1260
gaaaacaaca taaaggataa ggaccccatt gttgtgatgg taaatgtaaa aaagcatcaa 1320
actcaaccat tactcgcata caatgatgag agtttagaaa agtggcttga aaataggtgg 1380
atagaaaaag atagattaat cgagtccttg caaaaaaata ttaaaattga gaccaaataa 1440



<210> 228 
<211> 903 
<212> DNA 

<213> Saccharomyces sp. 



<400> 228 

atggaaaagt acaccaattg gagagacaat ggtacgggaa tagctccatt tctaccaaac 60
acaatcagga aacctagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgaaa 120
accattataa tgctaccatt gattatgctg taccttctaa ctggccagaa caacttactg 180
ggtttgatat tgaagtttac attcagttgg aaagaggaaa ttaccgtgca aggaatcaag 240
aaacgtgacg taaggaaatc caagcattat ccacagaagg gcaagcttta tatttgcaat 300
tgtacctcac ctttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360
ttggtcccat ccaatgacat tgtatacaaa gtttccataa gagaattcat caacttcatc 420
ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctcaattg 480
ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta 540
ccgtttagta taaccgggaa aaaacttaaa gaattcatag acccttcaat aaccacaatg 600
aaccccgcaa tggccaaaac taaaaaattt gaattgcaga ccatccaaat caaaactaat 660
aaaactgcca tcaccacatt gcccatctcc aatatggagt atttatctag atttctgaac 720
aagggcatta atgttaaatg caagatcaac gagccacaag tactctcgga taatttagag 780
gaattacgcg ttgcattaaa cggtggcgac aaatataaac tagtctcacg gaagttagat 840
gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagaggaag 900
tag 903



<210> 229 
<211> 2280 
<212> DNA 

<213> Saccharomyces sp. 
<400> 229 

atgcctgcac caaaacccac ggagaaattt gcctcttcca agagcacaca gaaaactiacg 60 
aattacagtt ccatcgaggc caaaagcgtc aagacgtcgg ctgatcaggc atacatctac 120 
caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gtcgtacaac 180
atcttccact gcttctttag agaaatcaga ggccggggca gtttcaaggt accgcaacag 240 
ggaccggtga tctttgttgc ggctccgcat gctaaccagt tcgtcgaccc tgtaatcctt 300
atgggcgagg tgaagaaatc tgtcaacaga cgtgtgtcct tcttgattgc ggagagctca 360
ttaaagcaac cccccatagg gtttttggct agtttcttca tggccatagg cgcggcaagg 420
ccgcaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480 
agagttatcg gccacgacac gcatttcttg actgattgta tgccaaaggg tctcatcgg^ 540 
ttacccaaat caatgggatt tggagaaatc cagtccatag aaa^tgacac gagtttgacc 600 
ctaagaaaag agttcaaaat ggccaaacca gagattaaaa ctgctttact caccggcact 660 
acttataaat atgccgctaa agtcgaccaa tcttgcgttt accatagagt ttttgagcat 720 
ttggcccata acaactgcat tgggatcttt cctgaaggtg ggtcccacga cagaacaaac 780 
ttgttgcccc tgaaagcagg tgtggcgatt atggctcttg gttgcatgga taagcatcct 840 
gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900 
tcgagagcgg ttgttgaatt cggtgacccc attgaaatac cgaaggaact agtcgccaag 960 
taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac ca^atcgaag 
1020 

ggtttacaat ccgttaccgt tacatgttct gattatgaaa ctttgatggt ggttcaaacg 

1080 

ataagaagac tatatatgac acaatttagc accaagttac cgttgccctt gattgtggaa 

1140 

atgaacagaa gaatggtcaa aggttacgaa ttctatagaa acgatcctiaa aatagcggsic 

1200 

ttgaccaaag atataatggc atataatgcc gccttgagac actataatct tcctgatcac 
1260 

cttgtggagg aggcaaaggt aaatttcgca aaaaacctcg gacttgtttt ttttagatcc 
1320 

atcgggctct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 

1380 

ttcatattag ccaagagaat ttctcaagaa aaggcccgta ccgctttgtc caagtctaca 
1440 

gttaaaataa aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt 
1500 

gcgcccttgc tttacatctt ttggtccgtt ttaatcactt attacctcag acataaacca 
1560 

tggaataaaa tatatgtttt ttccgggtct tacatctcgt gtgttatagt cacgtattcc 
1620 

gccttaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagacc actggtttta 
1680 

tctcttacat ctccaaaggg cttgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 
1740 

agaataatcg aagttgtaaa taactttgga agcgaattat tccccgattt cgatagtgcc 
1800 

gccctacgtg aagaattcga cgtcatcgat gaagaggaag aagatcgaaa aacctcagaa 
1860 

ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtca 
1920 

tcacctatca tcagccaacg tgacaaccac gatgcctatg aacaccataa ccaagattcc 
1980 

gatggcgtct cattggtcaa tagtgacaat tccctctcta acattccatt attctcttct 
2040 

acttttcatc gtaagtcaga gtcttcctta gcttcgacat ccgttgcacc ttcttcttcc 
2100 

tccgaatttg aggtagaaaa cgaaatcttg gaggaaaaaa atggattagc aagtaaaatic 
2160 

gcacaggccg tcttaaacaa gagaattggt gaaaatactg ccagggaaga ggaagaggaa 
2220 

gaagaagagg aagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 
2280 

<210> 230 
<211> 2232 
<212> DNA 

<213> Saccharomyces sp.
<400> 230 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60 
tcccgacggt acaaaaattc atacaatgga ttcgtataca atatacatac atggctgtat 120 
gatgtgtctg tatttctgtt taatattttg ttcactattt tcttcagaga aattaaggta 180 
cgtggtgcat ataacgttcc cgaagttggg gtgccaacca tccttgtgtg tgcccctcat 240
gcaaatcagt tcatcgaccc ggctttggta atgtcgcaaa cccgtttgct gaagacatca 300






gcgggaaagt cccgatccag aatgccttgt tttgttactg ctgagtcgag ttttaagaaa 360
agatttatct ctttctttgg tcacgcaatg ggcggtattc ccgtgcctag aattcaggac 420
aacttgaagc cagtggatga gaatcttgag atttacgctc cggacttgaa gaaccacccg 480
gaaatcatca agggccgctc caagaaccca cagactacac cagtgaactt tacgaaaagg 540
ttttctgcca agtccttgct tggattgccc gactacttaa gtaatgctca aatcaaggaa 600
atcccggatg atgaaacgat aatcctgtcc tctccattca gaacatcgaa atcaaaagtg 660
gtggagctct tgactaatgg tactaatttt aaatatgcag agaaaatcga caatacggaa 720 
actttccaga gtgtttttga tcacttgcat acgaagggct gtgtaggtat tttccccgag 780 
ggtggttctc atgaccgtcc ttcgttacta cccatcaagg caggtgttgc cattatggct 840 
ctgggcgcag tagccgctga tcctaccatg aaagttgctg ttgtaccctg tggtttgcat 900 
tatttccaca gaaataaatt cagatctaga gctgttttag aatacggcga acctatagtg 960 
gtggatggga aatatggcga aacgtataag gactccccac gtgagaccgt ttccaaacta 
1020 

ctaaaaaaga tcaccaattc tttgttttct gttaccgaaa atgctccaga ttacgatact 
1080 

ttgatggtca ttcaggctgc cagaagacta tatcaaccgg taaaagtcag gctacctttg 
1140 

cctgccattg tagaaatcaa cagaaggtta cttttcggtt attccaagtt taaagatgat 
1200 

ccaagaatta ttcacttaaa aaaactggta tatgactaca acaggaaatt agattcagtg 
1260 

ggtttaaaag accatcaggt gatgcaatta aaaactacca aattagaagc attgaggtgc 
1320 

tttgtaactt tgatcgttcg attgattaaa ttttctgtct ttgctatact atcgttaccg 
1380 

ggttctattc tcttcactcc aattttcatt atttgtcgcg tatactcaga aaagaaggcc 
1440 

aaagagggtt taaagaaatc attggttaaa attaagggta ccgatttgtt ggccacatgg 
1500 

aaacttatcg tggcgttaat attggcacca attttatacg ttacttactc gatcttgttg 
1560 

attattttgg caagaaaaca acactattgt cgcatctggg ttccttccaa taacgcattc 
1620 

atacaatttg tctattttta tgcgttattg gttttcacca cgtattcctc tttaaagacc 
1680 

ggtgaaatcg gtgttgacct tttcaaatct ttaagaccac tttttgtttc tattgtttac 
1740 

cccggtaaga agatcgaaga aatccaaaca acaagaaaga atttaagtct agagttgact 
1800 

gctgtttgta acgatttagg acctttggtt ttccctgatt acgataaatt agcgactgag 
1860 

atattctcta agagagacgg ttatgatgtc tcttctgatg cagagtcttc tataagtcgt 
1920 

atgagtgtac aatctagaag ccgctcttct tctatacatt ctattggctc gctagcttct 
1980 

aacgccctat caagagtgaa ttcaagaggc tcgttgaccg atattccaat tttttctgat 
2040 

gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgat 
2100 

gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 
2160 

cgcaacacaa atatatcttc gaagattgct tcgctggtaa gacagaaaag agaacacgaa 
2220 

aagaaagaat ga
2232 

<210> 231 
<211> 1194 
<212> DNA 

<213> Saccharomyces sp. 
<400> 231 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg tcgtcccagg tatttcctta 60
ttgattttct tccagggatg ccttattctt ttgtttctcc aactcaccta taagactctt 120
tactgtagaa atgatataag gaaacaaatt ggtctcaata aaaccaaaag attatttatt 180
gtcttggtat catccatttt gcatgttgtc gcaccatctg cagtgagaat taccactgaa 240
aattccagtg ttcctaaagg tacttttttt ttagacttga agaagaaaag gattctttct 300
catctaaagt ccaattcggt ggccatttgc aatcaccaaa tatacacgga ttggatattt 360
ttatggtggt tggcttacac atcgaactta ggggctaatg tcttcattat tttaaaaaaa 420
tcgttggctt ccattcctat cctcggtttc ggtatgagaa actataattt catttttatg 480
agtagaaagt gggcacaaga caaaataacc ctaagcaaca gccttgctgg ccttgattcg 540
aatgcaaggg gcgccggctc acttgctgga aagtcacctg agcgcataac tgaggaagga 600
gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaacctt 660
atcctattcc ctgaaggtac aaatctcagt gctgatacta ggcaaaaaag tgctaaatat 720
gctgccaaaa taggcaaaaa gccattcaag aatgtgctac tgcctcattc tacaggccta 780
agatactcgt tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840
tactccggtg taaaacagga ggaatatggt gagcttatat atgggctgaa gagcatattt 900
ttagaaggaa aatacccgaa gttagtcgat attcacatca gagcatttga tgttaaagat 960
attccattag aggacgagaa tgaattttca gaatggctgt ataaaattcg gagtgagaag 1020
gatgctctaa tggaaaggta ctattccact ggatcattcg taagtgatcc tgaaacaaac 1080
cattcagtta ccgatagttt caagatcaat cgtattgagt taactgaagt gctaatatta 1140
ccaactctaa caataatttg gttagtttat aaactttatt gttttatttt ttga 1194



<210> 232 
<211> 912 
<212> DNA 

<213> Saccharomyces sp. 



<400> 232 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg tgttggtcgt actggcgctt 60
gcaggctgtg gcttttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120
catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttggc 180
cttgacgtca aggtcgttgg cgaggagaat ttggccaaga agccatatat tatgattgcc 240
aatcaccaat ccaccttgga tatcttcatg ttaggtagga ttttcccccc tggttgcaca 300
gttactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360
ggtacatatt tcttagacag atctaaaagg caagaagcca ttgacacctt gaataaaggt 420
ttagaaaatg ttaagaaaaa caagcgtgct ctatgggttt ttcctgaggg taccaggtct 480
tacacgagtg agctgacaat gttgcctttc aagaagggtg ctttccattt ggcacaacag 540
ggtaagatcc ccattgttcc agtggttgtt tccaatacca gtactttagt aagtcctaaa 600
tatggggtct tcaacagagg ctgtatgatt gttagaattt taaaacctat ttcaaccgag 660
aacttaacaa aggacaaaat tggtgaattt gctgaaaaag ttagagatca aatggttgac 720
actttgaagg agattggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780
attgagtatg ccgctcttca acatgacaag aaagtgaaca agaaaatcaa gaatgagcct 840
gtgccttctg tcagcattag caacgatgtc aatacccata acgaaggttc atctgtaaaa 900
aagatgcatt aa 912



<210> 233 
<211> 54 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 233 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaa 54 

<210> 234 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 234 

tcgaggatcc gcggccgcaa gcttcctgca gg 32 



<210> 235 
<211> 32 
<212> DNA 
<213> Artificial Sequence



<220> 






<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 235 

tcgacctgca ggaagcttgc ggccgcggat cc 



<210> 236 
<211> 32 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 236 

tcgacctgca ggaagcttgc ggccgcggat cc 



<210> 237 
<211> 32 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 237 

tcgaggatcc gcggccgcaa gcttcctgca gg 



<210> 238 
<211> 36 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 238 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 



<210> 239 
<211> 28 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 239 

cctgcaggaa gcttgcggcc gcggatcc 



<210> 240 
<211> 36 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 240 

tcgacctgca ggaagcttgc ggccgcggat ccagct 



<210> 241 
<211> 28 
<212> DNA 
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 241 

ggatccgcgg ccgcaagctt cctgcagg 



